dedup search command
This search command removes data records that contain fields with duplicate values.
Records are removed based on the field names that you specify. For each distinct value (or combination of values) of the specified fields, the first record is kept and the remaining records are removed. You can specify a number (N) as the count of records with duplicate field values to keep. For example, you can keep the first three records with the same value and remove the remaining duplicate records.
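The following Python sketch models the behavior described above. It is an illustration only, not the product's implementation. The dedup function, its keep parameter (standing in for N), and the dictionary-based records are assumptions made for this example. The field values are taken from the sample indexed data in this topic.

```python
from collections import defaultdict

def dedup(records, fields, keep=1):
    """Keep the first `keep` records for each combination of values in `fields`.

    Hypothetical model of the documented behavior; `keep` plays the role of N.
    """
    seen = defaultdict(int)  # how many records were encountered per key
    kept = []
    for record in records:
        key = tuple(record.get(f) for f in fields)
        if seen[key] < keep:
            kept.append(record)
        seen[key] += 1
    return kept

# Field values from the sample indexed data, in search-result order.
records = [
    {"ClientIp": "10.1.1.141", "ResponseCode": "200"},
    {"ClientIp": "10.1.1.141", "ResponseCode": "201"},
    {"ClientIp": "10.1.1.140", "ResponseCode": "201"},
    {"ClientIp": "10.1.1.140", "ResponseCode": "404"},
]

result_single = dedup(records, ["ClientIp"])               # 2 records kept
result_pair = dedup(records, ["ClientIp", "ResponseCode"])  # all 4 kept
result_two = dedup(records, ["ClientIp"], keep=2)          # all 4 kept

print([r["ResponseCode"] for r in result_single])  # ['200', '201']
```

With a single field, only the first record for each ClientIp value survives. With two fields, every combination in this data is unique, so nothing is removed.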
To see the number of duplicate records removed, specify the showDetails option. The DuplicatesRemoved field is then added to each record that is kept. The value of this field is the count of records removed. You must specify the showDetails option to use the sortby and the multiple parameters.
If you want to sort the search results, you can specify the sortby parameter based on a field. By default, the results are sorted in lexicographical order. If you want to sort the results based on a field with numeric values, you can also sort the results numerically.
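The difference between the two sort orders matters when a field holds numbers stored as text. The following Python sketch uses made-up values (not from the sample data) to show how the two orderings can diverge. It illustrates the general concept and is not the product's sorting code.

```python
# Hypothetical field values, stored as strings the way they appear in a record.
sizes = ["100", "90", "1000", "200"]

# Lexicographical order compares character by character.
lexicographic = sorted(sizes)

# Numerical order compares the values as numbers.
numeric = sorted(sizes, key=float)

print(lexicographic)  # ['100', '1000', '200', '90']
print(numeric)        # ['90', '100', '200', '1000']
```

In lexicographical order, "90" sorts last because the character "9" is greater than "1" and "2", even though 90 is the smallest number.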
If you want to further analyze your data to see values of another field in the removed records, then you can use the multiple parameter. This can help you see multiple values of the field for the records returned by the dedup command.
For a list of all search commands, see Search-commands.
Syntax
dedup [N] [showDetails] <field>+ [sortby <sort-by-option>(<field>)] [multiple <field>]
In the preceding syntax, the following definitions apply:
- N indicates a number that represents the count of records with duplicate field values to be kept. By default, this number is 1.
- + indicates one or more similar expressions separated by a space.
- [Expression] indicates that the expression is optional.
- showDetails indicates whether you want to know the number of duplicate records removed for each unique field value. If you specify this option, the DuplicatesRemoved field is added to each record.
- <field> refers to the name of the field on which you want to run this command, or the name of the field on which you want to run the sortby or the multiple parameter.
- sortby indicates an optional parameter that you can run on a field name. You can add this parameter to sort the search results based on the value of the field specified. You can use one of the following options for sorting the search results:
  - num: Sorts the search results in a numerical order.
  - str: Sorts the search results in a lexicographical order.
- multiple <field> indicates an optional parameter that you run on a field name to see the unique values of that field occurring in the records returned by running this command.
Short examples
Example 1: Remove duplicate search results with the same ClientIp field value.
... | dedup ClientIp
Example 2: Remove duplicate search results containing the same values, for both the ClientIp and ResponseCode fields.
... | dedup ClientIp ResponseCode
Example 3: Remove duplicate search results with the same RequestType field value. Additionally, see the number of duplicate records removed and see the unique values of the ResponseSize field in the duplicate records removed.
... | dedup showDetails RequestType multiple ResponseSize
Example 4: Remove duplicate search results with the same RequestType field value. Additionally, see the number of duplicate records removed and sort the search results by the ResponseSize field in ascending order.
... | dedup showDetails RequestType sortby num(ResponseSize)
Long examples
The following sample data and sample indexed data (displayed on the Search tab) will help you understand the examples of using the dedup command.
- Sample data
- Sample indexed data
- dedup (single field)
- dedup (multiple fields)
- dedup with showDetails
- showDetails and multiple
- showDetails and sortby (num)
- showDetails, multiple, and sortby (num)
Sample data
10.1.1.140 - - [11/Jul/2013:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png |
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png |
Sample indexed data
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png HOST=local.bmc.com |ResponseSize=100|COLLECTOR_NAME=u4 |ClientIp=10.1.1.141 |ResponseCode=200 |RequestType=POST|RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png HOST=local.bmc.com |ResponseSize=200|COLLECTOR_NAME=u4 |ClientIp=10.1.1.141 |ResponseCode=201 |RequestType=PUT|RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png HOST=local.bmc.com |ResponseSize=150|COLLECTOR_NAME=u4 |ClientIp=10.1.1.140 |ResponseCode=201 |RequestType=GET|RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.140 - - [11/Jul/2013:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png HOST=local.bmc.com |ResponseSize=100|COLLECTOR_NAME=u4 |ClientIp=10.1.1.140 |ResponseCode=404 |RequestType=GET|RequestURL=/themes/ComBeta/images/bullet.png |
dedup (single field)
In this example, you use the command to remove duplicate search results with the same ClientIp field value.
Command
... | dedup ClientIp
Output
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |
dedup (multiple fields)
In this example, you use the command to remove duplicate search results containing the same values, for both the ClientIp and ResponseCode fields.
Command
... | dedup ClientIp ResponseCode
Output
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |
10.1.1.140 - - [11/Jul/2013:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=404 |ResponseSize=100 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |
dedup with showDetails
In this example, you use the command to perform the following actions:
- Remove duplicate search results with the same RequestType field value.
- See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
Command
... | dedup showDetails RequestType
Output
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0 |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0 |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1 |
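The DuplicatesRemoved values in the preceding output can be reproduced with a short Python sketch. This is a model of the documented behavior under stated assumptions, not the product's code. The dedup_show_details function and the dictionary records are hypothetical. The field values come from the sample indexed data.

```python
def dedup_show_details(records, fields):
    """Keep the first record per key and count how many later ones are dropped.

    Hypothetical model of `dedup showDetails <field>+`.
    """
    kept = {}  # insertion-ordered: key -> kept record
    for record in records:
        key = tuple(record.get(f) for f in fields)
        if key in kept:
            kept[key]["DuplicatesRemoved"] += 1
        else:
            kept[key] = {**record, "DuplicatesRemoved": 0}
    return list(kept.values())

# Field values from the sample indexed data, in search-result order.
records = [
    {"RequestType": "POST", "ResponseSize": "100"},
    {"RequestType": "PUT", "ResponseSize": "200"},
    {"RequestType": "GET", "ResponseSize": "150"},
    {"RequestType": "GET", "ResponseSize": "100"},
]

details = dedup_show_details(records, ["RequestType"])
for row in details:
    print(row["RequestType"], row["DuplicatesRemoved"])
# POST 0
# PUT 0
# GET 1
```

In this model, a count is final only after the last input record is read, because any later record might be another duplicate.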
showDetails and multiple
In this example, you use the command to perform the following actions:
- Remove duplicate search results with the same RequestType field value.
- See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
- See the unique values of the ResponseSize field in the duplicate records removed (indicated by the ResponseSize:UniqueValues field).
Command
... | dedup showDetails RequestType multiple ResponseSize
Output
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=100 |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=200 |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1|ResponseSize:UniqueValues=150,100 |
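The ResponseSize:UniqueValues field in the preceding output lists the value from the kept record followed by any new values from the removed records. The following Python sketch reproduces that result. It is an approximation inferred from the output shown in this topic. The function name and the use of a list to hold the unique values are assumptions.

```python
def dedup_multiple(records, fields, multiple_field):
    """Model of `dedup showDetails <field>+ multiple <field>` (hypothetical).

    Collects, per kept record, the distinct values of `multiple_field` seen in
    the kept record and in the duplicates removed for it, in first-seen order.
    """
    unique_name = f"{multiple_field}:UniqueValues"
    kept = {}
    for record in records:
        key = tuple(record.get(f) for f in fields)
        if key not in kept:
            kept[key] = {**record, "DuplicatesRemoved": 0, unique_name: []}
        else:
            kept[key]["DuplicatesRemoved"] += 1
        value = record.get(multiple_field)
        if value not in kept[key][unique_name]:
            kept[key][unique_name].append(value)
    return list(kept.values())

# Field values from the sample indexed data, in search-result order.
records = [
    {"RequestType": "POST", "ResponseSize": "100"},
    {"RequestType": "PUT", "ResponseSize": "200"},
    {"RequestType": "GET", "ResponseSize": "150"},
    {"RequestType": "GET", "ResponseSize": "100"},
]

rows = dedup_multiple(records, ["RequestType"], "ResponseSize")
for row in rows:
    print(row["RequestType"], ",".join(row["ResponseSize:UniqueValues"]))
# POST 100
# PUT 200
# GET 150,100
```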
showDetails and sortby (num)
In this example, you use the command to perform the following actions:
- Remove duplicate search results with the same RequestType field value.
- See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
- Sort the search results by the ResponseSize field in ascending order.
Command
... | dedup showDetails RequestType sortby num(ResponseSize)
Output
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0 |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1 |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0 |
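The preceding output suggests that duplicates are removed first and the kept records are sorted afterward. The GET record that is kept has ResponseSize=150, which is the first GET record in the search results, not the smallest one. The following Python sketch models that order of operations. The function name and the two-step structure are assumptions drawn from the output, not from the product's internals.

```python
def dedup_then_sort_num(records, fields, sort_field):
    """Model of `dedup showDetails <field>+ sortby num(<field>)` (hypothetical)."""
    kept = {}
    # Step 1: keep the first record per key and count the duplicates removed.
    for record in records:
        key = tuple(record.get(f) for f in fields)
        if key in kept:
            kept[key]["DuplicatesRemoved"] += 1
        else:
            kept[key] = {**record, "DuplicatesRemoved": 0}
    # Step 2: sort the kept records numerically, in ascending order.
    return sorted(kept.values(), key=lambda r: float(r[sort_field]))

# Field values from the sample indexed data, in search-result order.
records = [
    {"RequestType": "POST", "ResponseSize": "100"},
    {"RequestType": "PUT", "ResponseSize": "200"},
    {"RequestType": "GET", "ResponseSize": "150"},
    {"RequestType": "GET", "ResponseSize": "100"},
]

ordered = dedup_then_sort_num(records, ["RequestType"], "ResponseSize")
print([(r["RequestType"], r["ResponseSize"]) for r in ordered])
# [('POST', '100'), ('GET', '150'), ('PUT', '200')]
```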
showDetails, multiple, and sortby (num)
In this example, you use the command to perform the following actions:
- Remove duplicate search results with the same RequestType field value.
- See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
- See the unique values of the ResponseSize field in the duplicate records removed (indicated by the ResponseSize:UniqueValues field).
- Sort the search results by the ResponseSize field in ascending order.
Command
... | dedup showDetails RequestType multiple ResponseSize sortby num(ResponseSize)
Output
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=100 |
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1|ResponseSize:UniqueValues=150,100 |
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=200 |
Notes
When you specify the showDetails option, the search results are displayed only after all the events are processed. Therefore, if you use this option on a large volume of data, the search might take longer to complete.