dedup search command

This search command removes data records that contain fields with duplicate values.

Records are removed based on the field names that you specify. For each distinct combination of values in those fields, the first record is kept and the remaining records are removed. To keep more than one record, specify a number (N) as the count of records with duplicate field values to be kept. For example, you can keep the first three records with the same value and remove the remaining duplicate records.
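The keep-first-N behavior described above can be modeled with a short Python sketch. This is an illustration only, not the product's implementation: the function name dedup, the keep argument (standing in for N), and the sample records are assumptions made for this example.

```python
from collections import defaultdict

def dedup(records, fields, keep=1):
    """Keep the first `keep` records for each distinct combination of `fields`."""
    kept_so_far = defaultdict(int)  # key -> number of records already kept
    kept = []
    for record in records:
        key = tuple(record[f] for f in fields)
        if kept_so_far[key] < keep:
            kept_so_far[key] += 1
            kept.append(record)
    return kept

# Hypothetical records, listed in the order the search returns them.
records = [
    {"ClientIp": "10.1.1.141", "ResponseCode": "200"},
    {"ClientIp": "10.1.1.141", "ResponseCode": "201"},
    {"ClientIp": "10.1.1.140", "ResponseCode": "201"},
    {"ClientIp": "10.1.1.140", "ResponseCode": "404"},
]

first_only = dedup(records, ["ClientIp"])         # models: dedup ClientIp
first_two = dedup(records, ["ClientIp"], keep=2)  # models: dedup 2 ClientIp
```

With keep=1, only the first record for each ClientIp value survives. With keep=2, all four hypothetical records are kept because no ClientIp value occurs more than twice.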

To see the number of duplicate records removed, specify the showDetails option. This option adds the DuplicatesRemoved field to each record kept. The value of this field is the count of the records removed. The showDetails option is required if you want to use the sortby and multiple parameters.
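As a rough model of how the DuplicatesRemoved count could be derived, the following Python sketch keeps the first record for each field value and increments a counter for every later duplicate. The function name and the input records are hypothetical.

```python
def dedup_show_details(records, field):
    """Keep the first record per value of `field` and count the removed ones."""
    kept = {}  # field value -> kept record (dicts preserve insertion order)
    for record in records:
        value = record[field]
        if value not in kept:
            kept[value] = {**record, "DuplicatesRemoved": 0}
        else:
            kept[value]["DuplicatesRemoved"] += 1
    return list(kept.values())

# Hypothetical input: the GET request type occurs twice.
records = [
    {"RequestType": "POST"},
    {"RequestType": "PUT"},
    {"RequestType": "GET"},
    {"RequestType": "GET"},
]
result = dedup_show_details(records, "RequestType")
counts = [r["DuplicatesRemoved"] for r in result]
```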

If you want to sort the search results, you can specify the sortby parameter with a field. By default, the results are sorted in lexicographical order. If the field contains numeric values, you can also sort the results numerically.
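The difference between lexicographical and numerical ordering matters when the field holds numbers of different lengths. The following Python sketch uses made-up size values to show how the two orderings diverge. It illustrates the general sorting concepts, not the command's internal behavior.

```python
# Hypothetical field values, stored as text.
sizes = ["150", "99", "1000", "200"]

# Lexicographical ordering compares the values character by character.
lexicographic = sorted(sizes)

# Numerical ordering compares the values as numbers.
numeric = sorted(sizes, key=float)
```

Here the lexicographical order is 1000, 150, 200, 99, while the numerical order is 99, 150, 200, 1000.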

If you want to further analyze your data to see values of another field in the removed records, then you can use the multiple parameter. This can help you see multiple values of the field for the records returned by the dedup command.

Example

Suppose you run the dedup command on the HOST field. You find five records with unique values of the HOST field. You can see the data collector names associated with these five records.

Now suppose you want to know the data collector names (unique instances) occurring in all the duplicate records removed. For this, you can run the multiple parameter on the COLLECTOR_NAME field. By doing this, another field based on the same name is added in the format <FieldName>:UniqueValues=Value1,Value2,Value3 (in this case, COLLECTOR_NAME:UniqueValues).

In this way, the multiple parameter helps you correlate the data collector names with the host names.
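The following Python sketch shows one way the unique values could be gathered for each host. The host and collector names are invented for illustration. The sketch also assumes that the value from the kept record is included in the list, which is consistent with the sample output in the long examples in this topic.

```python
# Hypothetical (HOST, COLLECTOR_NAME) pairs in search-result order.
records = [
    ("hostA", "collector1"),
    ("hostA", "collector2"),
    ("hostA", "collector1"),
    ("hostB", "collector3"),
]

unique_values = {}  # HOST -> collector names in first-seen order
for host, collector in records:
    values = unique_values.setdefault(host, [])
    if collector not in values:
        values.append(collector)

# Render each list in the <FieldName>:UniqueValues=Value1,Value2 format.
formatted = {
    host: "COLLECTOR_NAME:UniqueValues=" + ",".join(values)
    for host, values in unique_values.items()
}
```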

For a list of all search commands, see Search commands.

Syntax

dedup [N] [showDetails] <field>+ [sortby <sort-by-option>(<field>)] [multiple <field>]

In the preceding syntax, the following definitions apply:

  • N indicates a number that represents the count of records with duplicate field values to be kept. By default, this number is 1.
  • + indicates one or more similar expressions separated by a space.
  • [Expression] indicates that the expression is optional.
  • showDetails indicates whether you want to know the number of duplicate records removed for each unique field value. If you specify this option, the DuplicatesRemoved field is added to each record.
  • <field> refers to the field name on which you want to run this command, or the field name on which you want to run the sortby or the multiple parameter.
  • sortby indicates an optional parameter that you can run on a field name. You can add this parameter to sort the search results based on the value of the field specified. You can use one of the following options for sorting the search results:

    Option    Description
    num       Sorts the search results in a numerical order.
    str       Sorts the search results in a lexicographical order.

    Notes

    • If you specify only the field name without the str option, the search results are sorted in lexicographical order by default.
    • This parameter can be used only when your command syntax uses the showDetails option.
  • multiple <field> indicates an optional parameter that you run on a field name to see the unique values of that field occurring in the records returned by running this command.

    Note

    This parameter can be used only when your command syntax uses the showDetails option.
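The syntax rules above can be summarized in a small Python helper that composes a dedup clause and rejects combinations that the notes rule out. The helper name build_dedup and its arguments are hypothetical. The sketch places sortby before multiple, as in the syntax line. One of the long examples in this topic places multiple before sortby, so the order might be flexible.

```python
def build_dedup(fields, keep=None, show_details=False, sortby=None, multiple=None):
    """Compose: dedup [N] [showDetails] <field>+ [sortby <opt>(<field>)] [multiple <field>]."""
    if not fields:
        raise ValueError("at least one field is required")
    if (sortby or multiple) and not show_details:
        raise ValueError("sortby and multiple require showDetails")
    parts = ["dedup"]
    if keep is not None:
        parts.append(str(keep))
    if show_details:
        parts.append("showDetails")
    parts.extend(fields)
    if sortby:
        option, field = sortby  # option must be "num" or "str"
        if option not in ("num", "str"):
            raise ValueError("sort option must be num or str")
        parts.append(f"sortby {option}({field})")
    if multiple:
        parts.append(f"multiple {multiple}")
    return " ".join(parts)

simple = build_dedup(["ClientIp"])
keep_three = build_dedup(["ClientIp"], keep=3)
sorted_clause = build_dedup(["RequestType"], show_details=True,
                            sortby=("num", "ResponseSize"))
```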

Short examples

Example 1: Remove duplicate search results with the same ClientIp field value.

... | dedup ClientIp

Example 2: Remove duplicate search results containing the same values for both the ClientIp and ResponseCode fields.

... | dedup ClientIp ResponseCode

Example 3: Remove duplicate search results with the same RequestType field value. Additionally, see the number of duplicate records removed and see the unique values of the ResponseSize field in the duplicate records removed.

... | dedup showDetails RequestType multiple ResponseSize

Example 4: Remove duplicate search results with the same RequestType field value. Additionally, see the number of duplicate records removed and sort the search results by the ResponseSize field in ascending order.

... | dedup showDetails RequestType sortby num(ResponseSize)

Long examples

The following sample data and sample indexed data (displayed on the Search tab) will help you understand the examples of using the dedup command.

Sample data

10.1.1.140 - - [11/Jul/2013:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 404 100
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100

Sample indexed data

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
HOST=local.bmc.com |ResponseSize=100|COLLECTOR_NAME=u4 |ClientIp=10.1.1.141 |ResponseCode=200 |RequestType=POST|RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
HOST=local.bmc.com |ResponseSize=200|COLLECTOR_NAME=u4 |ClientIp=10.1.1.141 |ResponseCode=201 |RequestType=PUT|RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
HOST=local.bmc.com |ResponseSize=150|COLLECTOR_NAME=u4 |ClientIp=10.1.1.140 |ResponseCode=201 |RequestType=GET|RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.140 - - [11/Jul/2013:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 404 100
HOST=local.bmc.com |ResponseSize=100|COLLECTOR_NAME=u4 |ClientIp=10.1.1.140 |ResponseCode=404 |RequestType=GET|RequestURL=/themes/ComBeta/images/bullet.png
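If you want to experiment with the examples outside the product, the field line shown under each record can be turned into a dictionary. The following Python sketch assumes that fields are separated by the | character, that names and values are separated by the first = character, and that values do not contain |. These assumptions are based only on how the sample is displayed here.

```python
def parse_fields(line):
    """Split 'NAME=value |NAME=value|...' into a dict, trimming stray spaces."""
    fields = {}
    for part in line.split("|"):
        name, _, value = part.partition("=")
        fields[name.strip()] = value.strip()
    return fields

line = (
    "HOST=local.bmc.com |ResponseSize=100|COLLECTOR_NAME=u4 "
    "|ClientIp=10.1.1.141 |ResponseCode=200 |RequestType=POST"
    "|RequestURL=/themes/ComBeta/images/bullet.png"
)
record = parse_fields(line)
```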

dedup (single field)

In this example, you use the command to remove duplicate search results with the same ClientIp field value.

Command

... | dedup ClientIp

Output

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png

dedup (multiple fields)

In this example, you use the command to remove duplicate search results containing the same values for both the ClientIp and ResponseCode fields.

Command

... | dedup ClientIp ResponseCode

Output

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png
10.1.1.140 - - [11/Jul/2013:15:01:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 404 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=404 |ResponseSize=100 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png
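All four records are retained in this output because no two records share the same combination of ClientIp and ResponseCode. The following Python sketch checks this against the sample values by comparing the number of distinct keys for one field and for both fields. Treating the field pair as a tuple key is a modeling assumption.

```python
# (ClientIp, ResponseCode) pairs from the sample indexed data.
records = [
    ("10.1.1.141", "200"),
    ("10.1.1.141", "201"),
    ("10.1.1.140", "201"),
    ("10.1.1.140", "404"),
]

# Distinct keys when only ClientIp is used.
by_ip = {ip for ip, _ in records}

# Distinct keys when ClientIp and ResponseCode are combined.
by_ip_and_code = set(records)
```

Two distinct ClientIp values leave two records, while four distinct pairs leave all four records.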

dedup with showDetails

In this example, you use the command to perform the following actions:

  • Remove duplicate search results with the same RequestType field value.
  • See the number of duplicate records removed (indicated by the DuplicatesRemoved field).

Command

... | dedup showDetails RequestType

Output

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1

showDetails and multiple

In this example, you use the command to perform the following actions:

  • Remove duplicate search results with the same RequestType field value.
  • See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
  • See the unique values of the ResponseSize field in the duplicate records removed (indicated by the ResponseSize:UniqueValues field).

Command

... | dedup showDetails RequestType multiple ResponseSize

Output

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=100
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=200
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1|ResponseSize:UniqueValues=150,100

showDetails and sortby (num)

In this example, you use the command to perform the following actions:

  • Remove duplicate search results with the same RequestType field value.
  • See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
  • Sort the search results by the ResponseSize field in ascending order.

Command

... | dedup showDetails RequestType sortby num(ResponseSize)

Output

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0

showDetails, multiple, and sortby (num)

In this example, you use the command to perform the following actions:

  • Remove duplicate search results with the same RequestType field value.
  • See the number of duplicate records removed (indicated by the DuplicatesRemoved field).
  • See the unique values of the ResponseSize field in the duplicate records removed (indicated by the ResponseSize:UniqueValues field).
  • Sort the search results by the ResponseSize field in ascending order.

Command

... | dedup showDetails RequestType multiple ResponseSize sortby num(ResponseSize)

Output

10.1.1.141 - - [11/Jul/2013:15:04:52 -0700] "POST /themes/ComBeta/images/bullet.png 
HTTP/1.1" 200 100
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=200 |ResponseSize=100 |RequestType=POST |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=100
10.1.1.140 - - [11/Jul/2013:15:02:52 -0700] "GET /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 150
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.140 |ResponseCode=201 |ResponseSize=150 |RequestType=GET |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=1|ResponseSize:UniqueValues=150,100
10.1.1.141 - - [11/Jul/2013:15:03:52 -0700] "PUT /themes/ComBeta/images/bullet.png 
HTTP/1.1" 201 200
COLLECTOR_NAME=u4 |HOST=local.bmc.com |ClientIp=10.1.1.141 |ResponseCode=201 |ResponseSize=200 |RequestType=PUT |RequestURL=/themes/ComBeta/images/bullet.png |DuplicatesRemoved=0|ResponseSize:UniqueValues=200
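The combined behavior in this example can be approximated in Python. The sketch below is a model built from the output shown above, not the product's implementation. It assumes that the first record per RequestType is kept, that the unique values include the kept record's own value in first-seen order, and that sorting is applied after duplicates are removed. The function name is hypothetical.

```python
def dedup_details(records, field, multiple, sort_field):
    """Model of: dedup showDetails <field> multiple <multiple> sortby num(<sort_field>)."""
    groups = {}  # field value -> kept record, removed count, unique values
    for record in records:
        key = record[field]
        if key not in groups:
            groups[key] = {"record": dict(record), "removed": 0, "values": []}
        else:
            groups[key]["removed"] += 1
        if record[multiple] not in groups[key]["values"]:
            groups[key]["values"].append(record[multiple])
    results = []
    for group in groups.values():
        row = group["record"]
        row["DuplicatesRemoved"] = group["removed"]
        row[f"{multiple}:UniqueValues"] = ",".join(group["values"])
        results.append(row)
    # Numerical sort, applied to the records that remain.
    return sorted(results, key=lambda row: float(row[sort_field]))

# RequestType and ResponseSize from the sample indexed data, in display order.
records = [
    {"RequestType": "POST", "ResponseSize": "100"},
    {"RequestType": "PUT", "ResponseSize": "200"},
    {"RequestType": "GET", "ResponseSize": "150"},
    {"RequestType": "GET", "ResponseSize": "100"},
]
results = dedup_details(records, "RequestType", "ResponseSize", "ResponseSize")
```

Under these assumptions, the rows come back in the order POST, GET, PUT, with the same DuplicatesRemoved and ResponseSize:UniqueValues values as in the output above.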

Notes

When you specify the showDetails option, the search results are displayed only after all the events are processed. Therefore, if you use this option on a large volume of data, the search might take longer to complete.
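One way to see why counting forces a full pass is to compare two Python generators. This is a conceptual sketch only: whether the product emits results incrementally without showDetails is an assumption, and all names are hypothetical. The first generator can emit a record as soon as its value is first seen. The second cannot emit anything until every record has been read, because a later duplicate could still change a count.

```python
def dedup_streaming(records, field):
    """Plain dedup: emit each first occurrence immediately."""
    seen = set()
    for record in records:
        if record[field] not in seen:
            seen.add(record[field])
            yield record

def dedup_with_counts(records, field):
    """Counting dedup: counts are final only after the last record is read."""
    kept = {}
    for record in records:
        value = record[field]
        if value not in kept:
            kept[value] = {**record, "DuplicatesRemoved": 0}
        else:
            kept[value]["DuplicatesRemoved"] += 1
    yield from kept.values()

def tracked(records, log):
    """Yield records while logging each one that has been read."""
    for record in records:
        log.append(record)
        yield record

records = [{"RequestType": t} for t in ["POST", "PUT", "GET", "GET"]]

read_streaming = []
next(dedup_streaming(tracked(records, read_streaming), "RequestType"))

read_counting = []
next(dedup_with_counts(tracked(records, read_counting), "RequestType"))
```

In this sketch, the first result of the plain version is available after one record is read, while the counting version reads all four records before it returns its first result.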
