Assigning the data pattern and date format to a data collector

When you create a data collector, you generally need to assign a data pattern and date format. Data is indexed and formatted based on the data pattern (and date format) that you assign.

This topic describes the different ways in which you can assign a data pattern and date format while creating a data collector.

Need for assigning a data pattern (or date format)

The data pattern tells the product how to format the data during indexing. You need to assign a data pattern based on the kind of data that you want to index. Because this information is used to format the data, it is important that you assign the correct data pattern (and optionally date format). A data pattern is usually required to detect the structure of the data so that fields can be extracted. Fields enhance the search capability because they enable you to create better searches, which in turn enable a deeper root-cause analysis. Some kinds of data, for example Windows events, are expected to occur in a standard format. For such data, you do not need to assign a data pattern. The product automatically indexes such data in the correct format.

The date format specifies how the timestamp in the data is formatted. Data patterns normally contain details of the date format. Therefore, when you assign a data pattern, the matching date format is automatically selected. You can override this selection by manually selecting or adding a new date format. If you assign both a pattern and a date format, the product uses the date format to index the timestamp and the pattern to index the rest of the event data. If you cannot find a matching data pattern, you can assign a matching date format instead and index the rest of the data as free text.

Before assigning a data pattern (or date format)

Before assigning a data pattern (or date format), you need to ensure that:

  • The Directory Path or File Path input is already set so that the data collector knows which data to collect.
  • The File Encoding input is correctly set.

Automatically detecting the data pattern (or date format)

The data collected is indexed based on the data pattern selected. Therefore, it is important that you select a data pattern that matches the format in which your data occurs.

The Pattern list includes all the data patterns available in the system.

Similarly, the Date Format list includes all the date formats available in the system, including those that are:

  • Part of an existing data pattern
  • Manually created at the time of data collector creation

Unless you know the exact data pattern name (or date format), selecting the correct data pattern (or date format) can be difficult. You can instead use the auto-detect feature to find a pattern matching the data that you want to collect. You can then browse through the filtered list to see a preview of the sample records.

To automatically detect the matching data patterns, click Auto-Detect next to Date Format. The auto-detect feature first tries to find matching data patterns. If it does not find any, it then looks for matching date formats. If it does not find matching date formats either, it provides the option to index the data as free text.
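The fallback order described above (data patterns first, then date formats, then free text) can be sketched as follows. This is only an illustration of the ordering, not the product's actual detection logic: the pattern name, the regular expression, the candidate date formats, and the function names are all hypothetical, and the date formats are written in Python strptime notation rather than the product's own notation.

```python
import re
from datetime import datetime

# Hypothetical candidates; the product ships its own pattern and date format lists.
DATA_PATTERNS = {
    "Hypothetical Access Log": re.compile(
        r'^\S+ \S+ \S+ \[[^\]]+\] "[A-Z]+ \S+ \S+" \d{3} \d+'
    ),
}
DATE_FORMATS = ["%d %b %Y %H:%M:%S", "%Y-%m-%d %H:%M:%S"]


def starts_with_timestamp(line, fmt):
    """Return True if the leading tokens of the line parse with fmt."""
    width = len(fmt.split())  # number of space-separated timestamp tokens
    try:
        datetime.strptime(" ".join(line.split()[:width]), fmt)
        return True
    except ValueError:
        return False


def auto_detect(sample_lines):
    """Return (kind, matches) in the order: data pattern, date format, free text."""
    patterns = [name for name, rx in DATA_PATTERNS.items()
                if all(rx.match(line) for line in sample_lines)]
    if patterns:
        return "data pattern", patterns
    formats = [fmt for fmt in DATE_FORMATS
               if all(starts_with_timestamp(line, fmt) for line in sample_lines)]
    if formats:
        return "date format", formats
    return "free text", []


print(auto_detect(["28 Apr 2014 10:58:28 Service started"]))
```

In this sketch, a sample that matches no pattern but starts with a recognizable timestamp falls through to the date format step, which corresponds to the Free Text with Timestamp outcome.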

Note

If you assign the date format only, then the timestamp is formatted as per the date format while the rest of the data is indexed as free text.

Generally, selecting only the date format produces search results with a less detailed categorization of data, which might restrict advanced searching capabilities.

If you are satisfied with one of the matches available, you need to assign it to the data collector. To do this, select the data pattern name (or date format) in the filtered list and click Apply Pattern. If you selected a data pattern, you will notice that the date format is automatically selected. If you selected a date format, then you will notice that the Pattern field is populated with the option Free Text with Timestamp, which means all data except the timestamp will be indexed as free text.

You can change the date format selection by manually selecting a matching item in the Date Format list. You can also create a new date format by selecting Create New Date Format. Similarly, you can change the data pattern selection by manually selecting an item from the Pattern list. For more information, see Manually assigning a data pattern (or date format).

Note

If you select both a pattern and a date format, the product uses the date format to index the timestamp and the pattern to index the rest of the event data.

Manually assigning a data pattern (or date format)

Even though you can use any method to assign a data pattern (or date format), using the auto-detect feature is recommended over the manual method.

However, there are some scenarios where a manual selection is recommended or might be more useful:

  • If you are collecting JSON data.
    You need to manually select the option to collect JSON data with (or without) timestamp from the Pattern list.
  • If the date format included in the data pattern does not exactly match the timestamp string format in the data.
    You can manually select a date format from the Date Format list and override the date format included in the selected data pattern.
  • If you cannot find a matching data pattern or date format, then you can create a new date format by selecting the appropriate option in the Date Format list.
    For more information, see Creating a new date format.

Other scenarios in which you might want to manually select a data pattern include:

  • If you are certain about an existing data pattern name (or date format) that matches the data.
  • If you want to index the data as free text.
    You can select the option to index the data as free text with (or without) timestamp.

After selecting the correct data pattern in the Pattern list, you can preview the sample records. Previewing results after selecting a date format is not supported.

Previewing data to test and modify the data pattern (or date format)

The preview feature allows you to see the effect of assigning a data pattern (or date format) to the data. This feature lets you preview a sample set of records without actually indexing the data with the selected data pattern (or date format). You can assign a data pattern (or date format) by using the auto-detect feature or by manually selecting the correct pattern.

To see a preview of the sample records, proceed as follows, depending on how you assigned the data pattern (or date format):

  • If you are using the auto-detect feature: Click Auto-Detect to see a list of matching data patterns (or date formats). Then, click the data pattern name (or date format) on the left of the filtered list to display the preview on the right.
  • If you are manually assigning the pattern: Select the data pattern and then click Preview next to Date Format.
    If the assigned pattern does not match the data, an error message is displayed to indicate this.
  • If you are manually assigning the date format: Select a date format, then select a matching data pattern. Then click Preview next to Date Format.
    If you cannot find a matching data pattern, then you can select Free Text with Timestamp.

Creating a new data pattern

If you are not satisfied with the results produced by the available data patterns, select Add Data Pattern at the end of the Pattern list. Selecting this option redirects you to the Create Data Pattern page under the Administration > Data Patterns tab. When you are redirected, the data collector inputs that you have specified so far are lost.

Creating a new date format

If your data contains a timestamp and you cannot find matching data patterns or date formats, then it is recommended that you create a new date format. The date format is then used for indexing the timestamp string, while the rest of the data is indexed as free text.

A date format can be created in the following ways:

  • As a part of the manual data pattern creation on the Administration > Data Patterns tab.
  • As a part of the data collector creation.

To create a date format during data collector creation, in the Date Format list, select Create new Date Format. Then, manually enter the correct format in the Date Format field.

In this case, the timestamp is indexed as per the defined format, while the rest of the data is indexed as free text.

Example

If your data file contains the timestamp, "28 Apr 2014 10:58:28", then your date format must be dd MMM yyyy HH:mm:ss.

For more examples, see Sample date formats.
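To make the effect of this date format concrete, the following minimal sketch parses the timestamp from the example and keeps the rest of the line as free text. It assumes that the letters in dd MMM yyyy HH:mm:ss follow the Java SimpleDateFormat convention and that the Python strptime directives shown are an equivalent rendering. The sample log line and the function name are hypothetical.

```python
from datetime import datetime

# Assumed Python equivalent of the product date format "dd MMM yyyy HH:mm:ss".
STRPTIME_FORMAT = "%d %b %Y %H:%M:%S"


def split_event(line, fmt=STRPTIME_FORMAT):
    """Parse the leading timestamp and return it with the remaining free text."""
    width = len(fmt.split())  # number of space-separated timestamp tokens
    tokens = line.split(" ")
    timestamp = datetime.strptime(" ".join(tokens[:width]), fmt)
    free_text = " ".join(tokens[width:])
    return timestamp, free_text


# Hypothetical event line that starts with the timestamp from the example.
ts, rest = split_event("28 Apr 2014 10:58:28 Connection pool exhausted")
print(ts.isoformat())  # 2014-04-28T10:58:28
print(rest)            # Connection pool exhausted
```

If the format does not match the timestamp string exactly (for example, MM instead of MMM for an abbreviated month name), parsing fails, which is why the date format must mirror the data.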

Using free text as your data pattern

If you cannot find matching data patterns, then you can consider selecting one of the following options as your data pattern. These options are available in the Pattern list.

  • Free Text with Timestamp: Useful when there is a matching date format.
    This option uses the date format to capture the timestamp and the rest of the data appears in raw format.
  • Free Text without Timestamp: Useful when no matches are found and when the data does not contain a timestamp.
    This option parses all the records as raw data.

    Note

    All the records processed using this option are assumed to be a single line of data with a line terminator at the end of the event. Records are distinguished on the basis of the new line separator.

    If you want to distinguish records in a custom way, then you can specify a custom string or regular expression in the Event Delimiter box that determines where a new record starts in the data. This string or regular expression must correspond to some text in your data that appears at the beginning of a line.

    Examples of using the event delimiter setting

    The following regular expression distinguishes records when the line starts with "INFO" or "ERROR" or "WARN".

    ^(INFO|WARN|ERROR)

    The following regular expression distinguishes records when the line starts with “com.bmc.ola”.

    ^(com\.bmc\.ola)
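The way such an event delimiter groups lines into records can be sketched as follows. This is an assumption about the behavior described above rather than the product's implementation: each line that matches the expression starts a new record, and every other line is appended to the current record. The function name and the sample text are hypothetical.

```python
import re


def split_records(text, delimiter):
    """Start a new record at each line that matches the delimiter expression."""
    rx = re.compile(delimiter)
    records = []
    for line in text.splitlines():
        if rx.search(line) or not records:
            records.append(line)          # the line begins a new record
        else:
            records[-1] += "\n" + line    # continuation of the current record
    return records


sample = (
    "INFO Starting job\n"
    "ERROR Job failed\n"
    "  at com.example.Job.run\n"
    "  at com.example.Main.main\n"
    "WARN Retrying"
)
for record in split_records(sample, r"^(INFO|WARN|ERROR)"):
    print(repr(record))
```

With the first expression, the two indented stack trace lines stay attached to the ERROR record instead of becoming separate records, so the sample yields three records.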

If your data contains a timestamp, but you cannot find matching data patterns or date formats, then it is recommended that you create a new date format. For more information, see Creating a new date format.

Assigning a data pattern for collecting JSON data

JSON data usually contains key-value pairs and nested objects that you might want to extract as fields (name=value pairs).

The following data patterns allow you to extract name=value pairs from JSON data. To assign this kind of pattern, you need to manually select it from the Pattern list.

  • JSON with Timestamp: If the JSON data contains a timestamp.
    In this case, you need to indicate the key whose value must be considered as the timestamp. To do this, enter the key name in the Timestamp Field box.
  • JSON without Timestamp: If the JSON data does not contain a timestamp.

After selecting the correct data pattern, you can preview the sample records.

Example of assigning a pattern for JSON data with timestamp

Use the following example to understand the effect of assigning a pattern for JSON data with timestamp:

Scenario

Suppose you want to collect and index the following sample data:

Sample JSON data (with timestamp)
{
    "clientip": "82.29.231.241",
    "date": "19/05/2016 03:31:44",
    "request": "/plants_store?cart.do.action=purchase&signoff.do",
    "product_id": "FL-DSH-02",
    "agent": "Mozilla 5.0 x11: U Linux i686; en-US; rv: 1.8.0.10) Gecko 20070223 CentOS 1.5.0.10-0.1.1l4.centos FireFox 1.5.0.10",
    "verb": "PUT",
    "referrer": "http://mystore.bmc.com//dog_store?order.do&action=purchase&JSESSION=f1ca05cd-c079-",
    "response": "RP-LI-02",
    "JSESSIONID": "f1ca05cd-c079-",
    "ident": "-",
    "auth": "-",
    "bytes": 1020,
    "num1": 3630,
    "num2": 19113,
    "httpversion": "1.1",
    "category_id": "PLANTS",
    "headers": {
        "user-agent": "CLMAgent/1.0",
        "connection": "persistent",
        "accept": "*/*"
    },
    "action": "purchase"
}

Assigning the data pattern

While creating the data collector, manually select JSON with Timestamp in the Pattern list, then manually select a matching date format from the Date Format list, and in the Timestamp Field enter "date".

Note that the Timestamp Field must contain the field name that corresponds to the timestamp in the JSON data.

Output

After the data is indexed, the following output is available. Observe that the JSON fields are automatically extracted as name=value pairs in the output.

The output displays the timestamp as the date field value (available in the JSON data).

{
    "clientip": "82.29.231.241",
    "date": "19/05/2016 03:31:44",
    "request": "/plants_store?cart.do.action=purchase&signoff.do",
    "product_id": "FL-DSH-02",
    "agent": "Mozilla 5.0 x11: U Linux i686; en-US; rv: 1.8.0.10) Gecko 20070223 CentOS 1.5.0.10-0.1.1l4.centos FireFox 1.5.0.10",
    "verb": "PUT",
    "referrer": "http://mystore.bmc.com//dog_store?order.do&action=purchase&JSESSION=f1ca05cd-c079-",
    "response": "RP-LI-02",
    "JSESSIONID": "f1ca05cd-c079-",
    "ident": "-",
    "auth": "-",
    "bytes": 1020,
    "num1": 3630,
    "num2": 19113,
    "httpversion": "1.1",
    "category_id": "PLANTS",
    "headers": {
        "user-agent": "CLMAgent/1.0",
        "connection": "persistent",
        "accept": "*/*"
    },
    "action": "purchase"
}

 COLLECTOR_NAME=Access type A Collector    |  HOST=myhost.bmc.com    |  COLLECTOR=Access type A.txt    |  DATA_PATTERN=JSON with Timestamp    |  headers.connection=persistent    |  request=/plants_store?cart.do.action=purchase&signoff.do    |  agent=Mozilla 5.0 x11: U Linux i686; en-US; rv: 1.8.0.10) Gecko 20070223 CentOS 1.5.0.10-0.1.1l4.centos FireFox 1.5.0.10    |  auth=-    |  ident=-    |  JSESSIONID=f1ca05cd-c079-    |  category_id=PLANTS    |  headers.user-agent=CLMAgent/1.0    |  clientip=82.29.231.241    |  product_id=FL-DSH-02    |  num1=3630    |  action=purchase,purchase,purchase    |  num2=19113    |  JSESSION=f1ca05cd-c079-    |  headers.accept=*/*    |  verb=PUT    |  referrer=http://mystore.bmc.com//dog_store?order.do&action=purchase&JSESSION=f1ca05cd-c079-    |  response=RP-LI-02    |  bytes=1020    |  httpversion=1.1 
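The extraction shown in this output, where nested keys such as headers.connection become dotted field names and the date key supplies the timestamp, can be approximated with the following sketch. It is an illustration under assumptions, not the product's parser: the function names are hypothetical, the strptime format is an assumed rendering of a dd/MM/yyyy HH:mm:ss date format, and the event is a shortened version of the sample data. The sketch does not reproduce fields such as JSESSION, which the sample output appears to derive from inside the request and referrer values.

```python
import json
from datetime import datetime


def flatten(obj, prefix=""):
    """Flatten nested JSON objects into dotted name=value pairs."""
    pairs = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            pairs.update(flatten(value, prefix=f"{name}."))
        else:
            pairs[name] = value
    return pairs


def index_event(raw, timestamp_field, date_format):
    """Return (timestamp, fields) for one JSON event."""
    fields = flatten(json.loads(raw))
    # The key named in Timestamp Field supplies the event timestamp.
    timestamp = datetime.strptime(fields.pop(timestamp_field), date_format)
    return timestamp, fields


# Shortened version of the sample event above.
raw = (
    '{"date": "19/05/2016 03:31:44", "verb": "PUT", "bytes": 1020, '
    '"headers": {"connection": "persistent", "accept": "*/*"}}'
)
ts, fields = index_event(raw, "date", "%d/%m/%Y %H:%M:%S")
print(ts.isoformat())
print(" | ".join(f"{name}={value}" for name, value in fields.items()))
```

If the name passed as the timestamp field does not exist in the event, the lookup fails, which mirrors the requirement that Timestamp Field must name a key that is present in the JSON data.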

Example of assigning a pattern for JSON data without timestamp

Use the following example to understand the effect of assigning a pattern for JSON data without timestamp:

Scenario

Suppose you want to collect and index the following sample data:

Sample JSON data (without timestamp)
{
    "clientip": "82.29.231.241",
    "request": "/plants_store?cart.do.action=purchase&signoff.do",
    "product_id": "FL-DSH-02",
    "agent": "Mozilla 5.0 x11: U Linux i686; en-US; rv: 1.8.0.10) Gecko 20070223 CentOS 1.5.0.10-0.1.1l4.centos FireFox 1.5.0.10",
    "verb": "PUT",
    "referrer": "http://mystore.bmc.com//dog_store?order.do&action=purchase&JSESSION=f1ca05cd-c079-",
    "response": "RP-LI-02",
    "JSESSIONID": "f1ca05cd-c079-",
    "ident": "-",
    "auth": "-",
    "bytes": 1020,
    "num1": 3630,
    "num2": 19113,
    "httpversion": "1.1",
    "category_id": "PLANTS",
    "headers": {
        "user-agent": "CLMAgent/1.0",
        "connection": "persistent",
        "accept": "*/*"
    },
    "action": "purchase"
}

Assigning the data pattern

While creating the data collector, manually select JSON without Timestamp in the Pattern list.

Output

After the data is indexed, the following output is available. Observe that the JSON fields are automatically extracted as name=value pairs in the output.

The output displays the timestamp as the time when the event was indexed. The timestamp is determined by the timezone of the Collection Host (host on which the Collection Station or Collection Agent is located).

{
    "clientip": "82.29.231.241",
    "request": "/plants_store?cart.do.action=purchase&signoff.do",
    "product_id": "FL-DSH-02",
    "agent": "Mozilla 5.0 x11: U Linux i686; en-US; rv: 1.8.0.10) Gecko 20070223 CentOS 1.5.0.10-0.1.1l4.centos FireFox 1.5.0.10",
    "verb": "PUT",
    "referrer": "http://mystore.bmc.com//dog_store?order.do&action=purchase&JSESSION=f1ca05cd-c079-",
    "response": "RP-LI-02",
    "JSESSIONID": "f1ca05cd-c079-",
    "ident": "-",
    "auth": "-",
    "bytes": 1020,
    "num1": 3630,
    "num2": 19113,
    "httpversion": "1.1",
    "category_id": "PLANTS",
    "headers": {
        "user-agent": "CLMAgent/1.0",
        "connection": "persistent",
        "accept": "*/*"
    },
    "action": "purchase"
}

 COLLECTOR_NAME=Access type B Collector    |  HOST=myhost.bmc.com    |  COLLECTOR=Access type B.txt    |  DATA_PATTERN=JSON without Timestamp   |  headers.connection=persistent   |  request=/plants_store?cart.do.action=purchase&signoff.do   |  agent=Mozilla 5.0 x11: U Linux i686; en-US; rv: 1.8.0.10) Gecko 20070223 CentOS 1.5.0.10-0.1.1l4.centos FireFox 1.5.0.10   |  auth=-   |  ident=-   |  JSESSIONID=f1ca05cd-c079-   |  category_id=PLANTS   |  headers.user-agent=CLMAgent/1.0   |  clientip=82.29.231.241   |  product_id=FL-DSH-02   |  num1=3630   |  action=purchase,purchase,purchase   |  num2=19113   |  JSESSION=f1ca05cd-c079-   |  headers.accept=*/*   |  verb=PUT   |  referrer=http://mystore.bmc.com//dog_store?order.do&action=purchase&JSESSION=f1ca05cd-c079-   |  response=RP-LI-02   |  bytes=1020   |  httpversion=1.1