extract
The extract command extracts values from a field, or from the raw event data, and assigns them to new fields by using Java regular expression capturing groups. You specify a regular expression that matches the field value (or raw event data) and that names the new fields to which the captured values are assigned. The regular expression must exactly match the field value (or raw event data) in the search results.
For a list of all search commands, see Search-commands.
Syntax
extract field=[<Source-Field>] "<Regex-Expression>"
In the preceding syntax, the following definitions apply:
- <Source-Field> is the name of the field from which you want to extract information. This value is optional. If you do not specify a field name, the raw event data is used.
- <Regex-Expression> is the Java regular expression, including its capturing groups. The expression combines the pattern to match with the names of the new field or fields to which the extracted values are assigned. Enclose the expression in double quotes (").
- Square brackets ([ ]) indicate that the enclosed item is optional.
For more information about specifying the regular expression, see Specifying the regular expression correctly.
Short examples
Example 1: Extract the log level (for example, warning) and the corresponding component action (for example, VpxUtil_InvokeWithOpId), and assign the values to the new fields LogLevel and ComponentAction, respectively.
... | extract field=".*?\[\d+\s+(?<LogLevel>\w+).*?\]\s+(?<ComponentAction>\w+).*"
Example 2: Extract two portions (host name and domain name) of the value of the HOST field and assign those values to two new fields, Hostname and Domainname.
... | extract field=HOST "(?<Hostname>[A-Za-z-]+)\.(?<Domainname>.+)"
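The way named capturing groups become new fields can be illustrated outside the product with plain Java. The following is a minimal sketch, not the product's implementation: the class `ExtractSketch` and its `extract` helper are hypothetical names, the use of `matches()` (whole-value match) is an assumption based on the statement that the expression must exactly match the value, and `Matcher.group(String)` requires Java 7 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ExtractSketch {
    // Hypothetical helper: applies the regex to a value and returns each
    // requested named group as a "new field". Returns an empty map when
    // the regex does not match the whole value.
    static Map<String, String> extract(String value, String regex, String... fieldNames) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(value);
        if (m.matches()) { // assumption: the whole value must match
            for (String name : fieldNames) {
                fields.put(name, m.group(name));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Same regular expression as Example 2, applied to a HOST value.
        Map<String, String> fields = extract(
                "my-server.bmc.com",
                "(?<Hostname>[A-Za-z-]+)\\.(?<Domainname>.+)",
                "Hostname", "Domainname");
        System.out.println(fields); // prints {Hostname=my-server, Domainname=bmc.com}
    }
}
```

Note that backslashes are doubled only because the expression is written as a Java string literal; in the search command the expression is typed as shown in the examples.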
Long examples
The following sample data and sample indexed data (displayed on the Search tab) will help you understand the examples of using the extract command.
- Sample data
- Sample indexed data
- Extract log level and component action
- Extract module name and time
- Extract value of the HOST field into two separate fields
Sample data
2014-11-18T15:50:53.872+05:30 [03140 warning 'VpxProfiler' |
Sample indexed data
2014-11-18T15:50:53.872+05:30 [03140 warning 'VpxProfiler' HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |opID=SWI-661f7bb8 |
Extract log level and component action
In this example, you use the command to extract the log level (warning) and the component (VpxUtil) along with the related action for that component (InvokeWithOpId).
When you run this command, two new fields are added to the output: LogLevel and ComponentAction.
Command
... | extract field=".*?\[\d+\s+(?<LogLevel>\w+).*?\]\s+(?<ComponentAction>\w+).*"
Output
2014-11-18T15:50:53.872+05:30 [03140 warning 'VpxProfiler' HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |opID=SWI-661f7bb8 |LogLevel=warning |ComponentAction=VpxUtil_InvokeWithOpId |
Extract module name and time
In this example, you use the command to extract the module name (VpxProfiler) and the time, in milliseconds, that the module took (2750) to execute the operation (opID=SWI-661f7bb8).
When you run this command, two new fields are added to the output: Module and TimeInms.
Command
... | extract field=".*?'(?<Module>\w+).*?(?<TimeInms>\d+)\sms"
Output
2014-11-18T15:50:53.872+05:30 [03140 warning 'VpxProfiler' HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |opID=SWI-661f7bb8 |Module=VpxProfiler |TimeInms=2750 |
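To see how the two raw-event expressions above behave, they can be tried in plain Java. This is only a sketch under stated assumptions: the sample data shown in this topic is abbreviated, so `RAW_EVENT` below is a hypothetical complete line invented for illustration (its tail, `] VpxUtil_InvokeWithOpId took 2750 ms`, is an assumption chosen to be consistent with the outputs shown). The class and method names are also hypothetical, and `Matcher.group(String)` requires Java 7 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RawEventExtractSketch {
    // Hypothetical raw event: the real sample line is truncated in this
    // topic, so the part after 'VpxProfiler' is an illustrative assumption.
    static final String RAW_EVENT =
            "2014-11-18T15:50:53.872+05:30 [03140 warning 'VpxProfiler' "
            + "opID=SWI-661f7bb8] VpxUtil_InvokeWithOpId took 2750 ms";

    // Expressions copied from the two commands above (escaped for Java).
    static final String LEVEL_AND_ACTION =
            ".*?\\[\\d+\\s+(?<LogLevel>\\w+).*?\\]\\s+(?<ComponentAction>\\w+).*";
    static final String MODULE_AND_TIME =
            ".*?'(?<Module>\\w+).*?(?<TimeInms>\\d+)\\sms";

    // Returns the value of one named group, or null when the expression
    // does not match the whole line.
    static String capture(String regex, String groupName) {
        Matcher m = Pattern.compile(regex).matcher(RAW_EVENT);
        return m.matches() ? m.group(groupName) : null;
    }

    public static void main(String[] args) {
        System.out.println("LogLevel=" + capture(LEVEL_AND_ACTION, "LogLevel"));
        System.out.println("ComponentAction=" + capture(LEVEL_AND_ACTION, "ComponentAction"));
        System.out.println("Module=" + capture(MODULE_AND_TIME, "Module"));
        System.out.println("TimeInms=" + capture(MODULE_AND_TIME, "TimeInms"));
    }
}
```

The second expression has no trailing `.*`, so under a whole-line match it succeeds here only because the hypothetical line ends in `ms`.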
Extract value of the HOST field into two separate fields
In this example, you use the command to extract the host name (my-server) and domain name (bmc.com) separately from the value of the HOST field.
When you run this command, two new fields are added to the output: Hostname and Domainname.
Command
... | extract field=HOST "(?<Hostname>[A-Za-z-]+)\.(?<Domainname>.+)"
Output
2014-11-18T15:50:53.872+05:30 [03140 warning 'VpxProfiler' HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |opID=SWI-661f7bb8 |Hostname=my-server |Domainname=bmc.com |
Specifying the regular expression correctly
Suppose you have the following line in your indexed data:
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |
You want to extract the search ID information (1401867925702) and the index information (bw-2014-06-02-06-006).
If you use the following incorrect syntax, the actual output does not match the expected output.
Incorrect syntax
Expected output | Actual output |
---|---|
SearchID=1401867925702 and Index=bw-2014-06-02-06-006 | SearchID=24 and Index=52 |
This discrepancy occurs because the quantifier in the expression .* is greedy: it matches as much data as possible, leaving only the minimum needed for the capturing groups that follow. To extract the exact information that you are looking for, use a reluctant (nongreedy) quantifier, such as .*?, instead.
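The difference between a greedy and a reluctant quantifier can be reproduced in plain Java with the indexed line shown earlier. The two expressions below are illustrative assumptions written for this sketch, not the expressions from this topic, so the greedy result differs from the actual output in the table; the class and method names are hypothetical, and `Matcher.group(String)` requires Java 7 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyVsReluctantSketch {
    static final String LINE =
            "ChartData found for searchId = 1401867925702, "
            + "index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |";

    // Illustrative expressions (assumptions). They differ only in the
    // leading quantifier: greedy .* versus reluctant .*?
    static final String GREEDY =
            ".*(?<SearchID>\\d+), index = (?<Index>\\S+),.*";
    static final String RELUCTANT =
            ".*?(?<SearchID>\\d+), index = (?<Index>\\S+),.*";

    // Returns the value of one named group, or null if the line does not match.
    static String capture(String regex, String groupName) {
        Matcher m = Pattern.compile(regex).matcher(LINE);
        return m.matches() ? m.group(groupName) : null;
    }

    public static void main(String[] args) {
        // Greedy: .* swallows all but the last digit of the search ID.
        System.out.println(capture(GREEDY, "SearchID"));    // prints 2
        // Reluctant: .*? stops at the first digit, so the whole ID is captured.
        System.out.println(capture(RELUCTANT, "SearchID")); // prints 1401867925702
        System.out.println(capture(RELUCTANT, "Index"));    // prints bw-2014-06-02-06-006
    }
}
```

In the greedy case the leading `.*` first consumes the entire line and then backtracks only as far as needed, so `\d+` is left with a single digit; the reluctant form grows from the left and lets `\d+` take the full number.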
You can use the following correct syntax to extract the exact information for search ID and index.
Correct syntax
Notes
- The product supports only Java regular expressions that are compatible with Java Runtime Environment (JRE) version 1.6.
- The regular expression that you specify in the command must match the specified field value or raw data entry.
- You cannot use the default field names HOST, COLLECTOR_NAME, or DATA_PATTERN as the value of the target field.