The extract command lets you specify a regular expression that matches the target field value (or raw event data) that you want to extract, and then assigns the extracted values to the new fields that you specify. The regular expression must exactly match the field value (or raw event data) in the search results.
For a list of all search commands, see Search commands.
Suppose you have the following line in your indexed data:
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |
You want to extract the search ID information (1401867925702) and the index information (bw-2014-06-02-06-006).
If you use the following incorrect syntax, the actual output does not match the expected output.
Incorrect syntax
... | extract field=".*=\s*(?<SearchID>\d+).*=\s*(?<Index>\w+).*"
Expected output | Actual output |
---|---|
SearchID=1401867925702 and Index=bw-2014-06-02-06-006 | SearchID=24 and Index=52 |
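This discrepancy occurs because .* is a greedy quantifier: it consumes as much of the event as it can while still allowing the rest of the expression to match, so the named groups capture values near the end of the event instead of the search ID and index. To extract the exact information that you are looking for, you must use a reluctant (nongreedy) quantifier, .*?, which matches as little as possible.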
You can use the following correct syntax to extract the exact information for search ID and index.
Correct syntax
... | extract field=".*?=\s*(?<SearchID>\d+).*?=\s*(?<Index>\w+).*"
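The difference between the two patterns can be reproduced outside the product. The following is a minimal Python sketch, not the product's implementation. It assumes Python's `re` module, which writes named groups as `(?P<name>...)` instead of the Java form `(?<name>...)`, and it uses `re.fullmatch` to mimic the requirement that the expression match the whole value. The event string is hypothetical: it is invented so that further `key = number` pairs follow the index, and its index value uses underscores so that `\w+` can capture all of it.

```python
import re

# Hypothetical event, invented for illustration (not taken from the product).
EVENT = "searchId = 1401867925702, index = bw_2014, read = 24, remaining = 52"

# The two patterns from this topic, rewritten with Python's (?P<name>...) syntax.
GREEDY = r".*=\s*(?P<SearchID>\d+).*=\s*(?P<Index>\w+).*"
LAZY = r".*?=\s*(?P<SearchID>\d+).*?=\s*(?P<Index>\w+).*"


def extract_fields(pattern, text):
    """Return the named groups as a dict if the pattern matches the whole text."""
    match = re.fullmatch(pattern, text)
    return match.groupdict() if match else {}


greedy_fields = extract_fields(GREEDY, EVENT)
lazy_fields = extract_fields(LAZY, EVENT)

print(greedy_fields)  # groups are pushed toward the end of the event
print(lazy_fields)    # groups bind to the first two "key = value" pairs
```

With this hypothetical event, the greedy pattern captures the last two numbers (24 and 52), while the reluctant pattern captures the search ID and the index.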
The extract command uses the following syntax:
extract field=[<Source-Field>] "<Regex-Expression>"
In the preceding syntax, the following definitions apply:
<Source-Field> refers to the name of the source field from which you want to extract information. This parameter is optional. If you do not specify a field name, the raw event data is used.
<Regex-Expression> refers to the Java regular expression that you want to apply. The expression must combine the pattern to match with named capturing groups, written as (?<FieldName>...), that name the new field or fields to which the extracted values are assigned. The expression must be enclosed in double quotation marks (").
Example 1: Extract the value of the fetched records data entry (read records) and the remaining data entry (remaining records) and assign the values to new fields, ReadCount and RemainCount, respectively.
... | extract field=".*=\s*(?<ReadCount>\d+).*=\s*(?<RemainCount>\w+).*"
Example 2: Extract two portions (host name and domain name) of the value of the HOST field and assign those values to two new fields, Hostname and Domainname.
... | extract field=HOST "(?<Hostname>[A-Za-z-]+)\.(?<Domainname>.+)"
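To see how the Example 2 expression splits a host value, here is a small Python sketch. It is an illustration under assumptions, not the product's engine: it uses Python's `re` module, where named groups are written `(?P<name>...)`, and `re.fullmatch` to require that the whole value match. The helper name `split_host` is hypothetical.

```python
import re

# Example 2 pattern, rewritten with Python's (?P<name>...) group syntax.
HOST_PATTERN = r"(?P<Hostname>[A-Za-z-]+)\.(?P<Domainname>.+)"


def split_host(host_value):
    """Return Hostname and Domainname if the whole value matches, else {}."""
    match = re.fullmatch(HOST_PATTERN, host_value)
    return match.groupdict() if match else {}


print(split_host("my-server.bmc.com"))
print(split_host("server01.bmc.com"))  # hypothetical host name with digits
```

The host name group stops at the first dot because its character class allows only letters and hyphens, and the domain group takes everything after that dot. In this sketch, a host name that contains digits (such as the hypothetical server01.bmc.com) does not match at all, because the character class excludes digits.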
The following sample data and sample indexed data (as displayed on the Search tab) can help you understand the examples of using the extract command.
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, |
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |
In this example, you use the command to extract the number of records that were read and the number of records that remain.
When you run this command, two new fields are added to the output: ReadCount and RemainCount.
... | extract field=".*=\s*(?<ReadCount>\d+).*=\s*(?<RemainCount>\w+).*"
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |ReadCount=24|RemainCount=52 |
In this example, you use the command to extract the search ID and the index information from the search results.
When you run this command, two new fields are added to the output: SearchID and Index.
... | extract field=".*?=\s*(?<SearchID>\d+).*?=\s*(?<Index>\w+).*"
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |SearchID=1401867925702|Index=bw-2014-06-02-06-006 |
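One detail is worth checking when you adapt this expression: in standard Java and Python regular expressions, `\w` matches letters, digits, and the underscore, but not the hyphen. The sketch below runs the expression against the sample event and compares it with a variant that uses `[\w-]+` for the index. It assumes Python's `re` module and its `(?P<name>...)` group syntax rather than the product's own engine, and the variant is an assumption made for illustration, not documented syntax.

```python
import re

# Sample event from this topic.
EVENT = ("ChartData found for searchId = 1401867925702, "
         "index = bw-2014-06-02-06-006, "
         "HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |")

# Expression from this topic, with Python's (?P<name>...) group syntax.
DOCUMENTED = r".*?=\s*(?P<SearchID>\d+).*?=\s*(?P<Index>\w+).*"
# Assumed variant: the index group also accepts hyphens.
VARIANT = r".*?=\s*(?P<SearchID>\d+).*?=\s*(?P<Index>[\w-]+).*"

documented_index = re.fullmatch(DOCUMENTED, EVENT).group("Index")
variant_index = re.fullmatch(VARIANT, EVENT).group("Index")

print(documented_index)  # stops at the first hyphen in Python's engine
print(variant_index)     # includes the hyphenated parts of the index
```

In Python, the first pattern captures only the part of the index before the first hyphen, and the variant captures the whole index value. If your results show a truncated index, a character class such as `[\w-]+` may be worth trying.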
In this example, you use the command to extract the host name and domain name separately from the value of the HOST field.
When you run this command, two new fields are added to the output: Hostname and Domainname.
... | extract field=HOST "(?<Hostname>[A-Za-z-]+)\.(?<Domainname>.+)"
ChartData found for searchId = 1401867925702, index = bw-2014-06-02-06-006, HOST=my-server.bmc.com |COLLECTOR_NAME=my_coll |Hostname=my-server|Domainname=bmc.com |