
The easiest way of collecting data is to create a data collector from the Administration > Data Collectors tab. Data collectors collect data and send it to the Indexer for indexing. When you perform a search, the indexed data is made available as a series of individual records (or search results).

When creating a data collector, you need to provide various inputs such as details of the target host from where you plan to collect data, and other parameters such as the data pattern (matching the data that you want to collect), the date format to use for indexing the date and time string, the file encoding to use, and other advanced (but optional) settings. Before you create a data collector, you need to understand the kind of data that you want to collect and collate all the inputs required for creating the particular data collector type.

The following information can help you understand the data collector creation process and associated best practices.

Preparing for data collector creation

As a best practice, create a deployment plan before you start creating data collectors. If you plan to configure a large number of data collectors for multiple applications, planning up front saves time and makes your data-collection process as efficient as possible. You can start by creating a list of sources of all the data that you want to collect and then fill in key properties about each data source. For example, the attached spreadsheet contains a list of sample inputs required while creating data collectors.
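If you prefer to keep the source list in a file that you can review and version, the following minimal sketch writes such an inventory as CSV. The column names and the sample row are hypothetical; they are loosely based on the inputs named earlier (target host, data pattern, date format, file encoding) and are not the layout of the attached spreadsheet.

```python
import csv
import io

# Hypothetical column names; adapt them to the inputs your collector types need.
FIELDS = ["name", "host", "path", "data_pattern", "date_format", "file_encoding"]


def write_inventory(sources, stream):
    """Write one CSV row per planned data source and return the row count."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for source in sources:
        # Refuse incomplete rows so gaps surface before collector creation.
        missing = [f for f in FIELDS if not source.get(f)]
        if missing:
            raise ValueError(f"{source.get('name', '<unnamed>')}: missing {missing}")
        writer.writerow(source)
    return len(sources)


# Sample values only; none of them come from the product.
sources = [
    {
        "name": "tomcat-access-host1",
        "host": "Host1",
        "path": "/opt/tomcat/logs/access.log",
        "data_pattern": "Apache Tomcat",
        "date_format": "dd/MMM/yyyy:HH:mm:ss Z",
        "file_encoding": "UTF-8",
    },
]

buffer = io.StringIO()
row_count = write_inventory(sources, buffer)
print(buffer.getvalue())
```

Rejecting rows with empty fields is a design choice of this sketch: it forces every input to be collated before you open the Data Collectors tab.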

Icons and associated functions on the Data Collectors tab

The Data Collectors tab allows you to manage data collectors. To access this tab, navigate to Administration > Data Collectors.

The Data Collectors tab displays a default data collector that collects data available in the Collection_metrics.log file. The Dashboards tab displays a graph summarizing the data collected by this data collector. For more information, see Creating and managing dashboards. For more information about collecting data from the product metrics files, see Monitoring the product metric files.

You can perform the following actions on the Data Collectors tab.

Action

Icon

Description

Add Data Collector

Add a new data collector.

The kind of data that you want to collect determines the type of data collector. The inputs required for creating a data collector vary based on the data collector type. For more information, see Creating data collectors.

Best practice: Add new data collectors incrementally and follow each addition with a simple validation phase. As part of the validation, check whether data collection has started and whether you can perform searches.

Configuring data collectors incrementally has the following advantages:

  • If you provided any incorrect settings as part of the data-collector configuration, you do not have to start the whole process from the beginning; you need only re-edit or delete the existing data collectors and then re-create a smaller set of data collectors.
  • Assessing performance impact to the system is easier if you apply a validation phase between configurations.
  • Configuring similar data collectors in one configuration session (for example, on one day) and a different set of similar data collectors in another session (for example, on another day) keeps each session simple.
Edit Data Collector

Edit the selected data collector.

You can modify the same details that you provided while adding a data collector.

Note: You cannot modify a data collector if the data collector is of the Upload File type.

View Data Collector

 
View details of the selected data collector.

Delete Data Collector

Delete the selected data collectors. Optionally, select the Delete data for this Data Collector check box if you want to delete all the data collected by that data collector so that it is no longer available for searching. Click OK to confirm your action.

Note: There might be some residual data remaining in the system that was still being collected when you decided to delete the data collected by the data collector. Such data is deleted from the system when the data retention period is over.

Clone Data Collector

Make a copy of the selected data collector.

Collection Status History

(Last 10 polls status)

View the individual status of the last ten polls for a data collector. For more information, see Understanding the data collection status.

To refresh the list displayed, click Refresh Collection Status History. Note that this feature is not supported for the Upload File type of data collector because this data collector is meant for one-time collection of data.

Start Data Collector(s)

Start the selected data collectors.

This action is not relevant for the Upload File type of data collector because it performs one-time data collection after the data collector is created.

Stop Data Collector(s)

Stop the selected data collectors that are already started.

  • If you want to avoid data collection during a particular time period (for example, during your maintenance window), you can stop the data collection. No data is collected during this time. The next time you start data collection, it begins from that point onward.
  • If the Indexer restarts after a long time, the result might be a sudden increase in data-collection traffic (from the Collection Station and Collection Agents). To avoid this problem, consider stopping all the data collectors (when the Indexer is down) and then restarting the data collectors after the Indexer is up again.
Refresh Data Collector List

Manually refresh the list of data collectors to see the latest poll status and any other updates made to the data collectors.

Change Maximum Data Retention Period

Change the maximum data retention period (in days) for the selected data collectors.

Use the following options to set the data retention period:

  • Default: Sets the value specified as the maximum data retention period at Administration > System Settings.
  • Custom: Allows you to specify a custom value in the Data Retention Period (in days) field. This value cannot be a decimal value and must be greater than zero.

By default, the upper limit for changing the data retention period is set to 14 days. This upper limit is available at Administration > System Settings. You can customize this value by modifying the following property value:

  • Property name: max.data.collector.data.retention.limit
  • Property location: %BMC_ITDA_HOME%\custom\conf\server\searchserviceCustomConfig.properties

After changing the property value, you need to restart the Search component to apply the change.
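The rules for a custom retention value can be summarized in a short check. The following is a minimal sketch, not product code: the function name is hypothetical, and rejecting values above the upper limit is an assumption about how the limit is enforced.

```python
DEFAULT_UPPER_LIMIT_DAYS = 14  # default upper limit described above


def validate_retention_days(value, upper_limit=DEFAULT_UPPER_LIMIT_DAYS):
    """Return the retention period in days, or raise ValueError if invalid."""
    # The value cannot be a decimal, so accept whole numbers only.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("retention period must be a whole number of days")
    if value <= 0:
        raise ValueError("retention period must be greater than zero")
    # Assumption: values above the configured upper limit are rejected.
    if value > upper_limit:
        raise ValueError(f"retention period exceeds the upper limit of {upper_limit} days")
    return value


print(validate_retention_days(7))
```

If you raise the upper limit through the property, pass the new value as upper_limit, for example validate_retention_days(21, upper_limit=30).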

For more information about how data retention works, see Understanding data retention and deletion.

Search

In the search bar, at the top-right side of your screen, you can filter data collectors based on the following columns:

  • Name
  • Host
  • Data Pattern
  • Tags

To filter data collectors, you can specify a value from any of the preceding columns (except Tags) either fully or partially. Each time you search, the data collectors are filtered based on values found in one or more of the preceding columns.

To filter data collectors by tags, you need to specify the tag name and value in the format TagName=TagValue. You can also specify a comma-separated list of tag name=value pairs.

Examples:

  • Suppose data collectors A and B are associated with the same data pattern, Apache Tomcat.
    Searching for Apache returns both the data collectors.
  • Suppose you have two data collectors: one named Access Log and the other named Collector2. Both data collectors are associated with the data pattern, Apache Tomcat.
    Searching for Apache returns both the data collectors.
  • Suppose you have three data collectors with the OS tag. The tag values are as follows:
    • Data collectors A and B contain the tag value Linux.
    • Data collector C contains the tag value Windows.
    Searching for OS=Linux returns both data collectors A and B.
    Searching for OS=Windows,Linux returns all the data collectors A, B, and C.

In addition, you can filter data collectors in the following ways:

  • Find an exact match for the specified search term: To do this, enclose the search term in double-quotes.
    Examples:
    • Suppose you have two data collectors: one associated with host Host1 and the other associated with host Host123.
      Searching for Host1 returns both the data collectors.
      Searching for "Host1" returns only the data collector associated with Host1.
    • Suppose you have a data collector named DC=Houston. This data collector contains an equals sign (=) in the name.
      Unless you have a tag named DC with the value Houston, searching for DC=Houston does not return any results. Searching for a name=value pair is treated as searching for a tag name=value pair.
      In this scenario, you can search for "DC=Houston" to find an exact match of the data collector name.
  • Search on the Name column only: To do this, specify the search term in the format collectorName=<name>.
    Note: Providing a partial data collector name in this format returns all the data collectors matching that name.
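
The filtering rules above can be modeled in a few lines. The following is an illustrative sketch only, not the product's implementation. The Collector class and function names are hypothetical, and two behaviors are assumptions: partial matches ignore case, and a comma-separated value without its own tag name (as in OS=Windows,Linux) reuses the preceding tag name.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    name: str
    host: str = ""
    data_pattern: str = ""
    tags: dict = field(default_factory=dict)


COLUMNS = ("name", "host", "data_pattern")


def parse_tag_pairs(term):
    """Split 'OS=Windows,Linux' into [('OS', 'Windows'), ('OS', 'Linux')]."""
    pairs, current = [], None
    for piece in term.split(","):
        piece = piece.strip()
        if "=" in piece:
            current, value = (part.strip() for part in piece.split("=", 1))
        else:
            value = piece  # assumption: reuse the preceding tag name
        if current and value:
            pairs.append((current, value))
    return pairs


def matches(collector, term):
    term = term.strip()
    # A term in double quotes must equal a column value exactly.
    if len(term) >= 2 and term.startswith('"') and term.endswith('"'):
        exact = term[1:-1]
        return any(getattr(collector, column) == exact for column in COLUMNS)
    if "=" in term:
        key, value = term.split("=", 1)
        # collectorName=<name> searches the Name column only (partial match).
        if key.strip() == "collectorName":
            return value.strip().lower() in collector.name.lower()
        # Any other name=value term is treated as a tag search.
        return any(collector.tags.get(k) == v for k, v in parse_tag_pairs(term))
    # A plain term is a partial match against Name, Host, or Data Pattern.
    needle = term.lower()
    return any(needle in getattr(collector, column).lower() for column in COLUMNS)


def filter_collectors(collectors, term):
    return [c.name for c in collectors if matches(c, term)]


collectors = [
    Collector("A", host="Host1", data_pattern="Apache Tomcat", tags={"OS": "Linux"}),
    Collector("B", host="Host123", data_pattern="Apache Tomcat", tags={"OS": "Linux"}),
    Collector("C", host="Host2", tags={"OS": "Windows"}),
    Collector("DC=Houston", host="Host3"),
]

for term in ("Apache", "OS=Windows,Linux", '"Host1"', "DC=Houston", '"DC=Houston"'):
    print(term, "->", filter_collectors(collectors, term))
```

The sample collectors mirror the examples above, so each printed result can be compared with the behavior described in the text.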

The Data Collectors tab provides the following information:

Field

Description

Show Data Collected

Click Show Data Collected next to one of the data collectors in the last column on the right, to search the data collected by that data collector.

When you click Show Data Collected, by default the search is run for the search string, COLLECTOR_NAME="DataCollectorName".

In the preceding search string, DataCollectorName refers to the name of the data collector.

Name

Name of the data collector.

Path

File path of the data file used while creating the data collector.

Host

Host name of the server on which the data exists.

Tags

List of tag names with their corresponding values added to the data collector.

Poll Status

Overall polling status for the data collector, as follows:

  • The green square indicates that polling was successful.
  • The yellow square indicates that a few polls were unsuccessful.
  • The red square indicates successive unsuccessful polling. If your data collection fails more than four times consecutively, the status changes to red.
  • The square with no color indicates that the polling status is unavailable.

For more information, see Understanding the data collection status.
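
As a rough illustration of how the colors relate to poll results, the following sketch derives a status from a list of recent polls. It is not the product's algorithm. The function name is hypothetical, and three details are assumptions: only the last ten polls are considered, consecutive failures are counted back from the most recent poll, and any other failure in that window yields yellow.

```python
def poll_status(history):
    """Return a status color for a list of poll results.

    history holds booleans, oldest first; True means the poll succeeded.
    """
    recent = list(history)[-10:]  # assumption: look at the last ten polls
    if not recent:
        return "unavailable"
    # Count failures backward from the most recent poll.
    consecutive_failures = 0
    for succeeded in reversed(recent):
        if succeeded:
            break
        consecutive_failures += 1
    # Red requires more than four consecutive failures.
    if consecutive_failures > 4:
        return "red"
    if not all(recent):
        return "yellow"
    return "green"


print(poll_status([True] * 10))
print(poll_status([True] * 5 + [False] * 5))
```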

State

Displays the data collector state, to indicate whether the data collector was started or stopped, as follows:

  • If the data collector was started, then this column displays Started.
  • If the data collector was stopped, then this column displays Stopped.

Note: The start and stop actions are not relevant for the Upload File type of data collector because it performs one-time collection. Therefore, for this data collector, the State column displays a dash (-).

Date Modified

Date and time when the data collector was modified.

If you did not modify the data collector, then the date and time when the data collector was created is displayed.

Last Event Timestamp

Date and time of the last event that got indexed by the data collector.

Type

Type of the data collector.

Data Pattern

Data pattern used for creating the data collector.

For more information about assigning a data pattern, see Assigning the data pattern and date format to a data collector.

Best practices for using common data collector settings

The following table lists the best practices that you can apply while using the common data collector settings (at the time of creating a data collector).

Setting

Best practice
Name

Use a consistent data-collector naming convention. Doing this can help you accomplish the following tasks:

  • Easily find your data collector in the table at Administration > Data Collectors
  • Easily search for data from particular data collectors by using the collector name as a filter (COLLECTOR_NAME field)
Poll interval

Retain the default poll interval unless you have a special reason for changing it (such as running one of the script collectors only once every hour).  

Polling every 5 minutes (versus every 1 minute) does not make any noticeable difference in performance.

Filename/ rollover pattern

If the current log file being written to consistently uses the same name, provide the exact log file name.  

If the current log file name is consistently changing, use a match pattern (such as error.*.log). However, it is important to ensure that your pattern matches only the current log file, to avoid processing extra data that might not be intended.
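
Before you save a match pattern, you can check it against the file names in the log directory. The following is a minimal sketch that assumes shell-style wildcard matching (Python's fnmatch); the product's actual pattern syntax may differ, and the file names are made up for illustration.

```python
from fnmatch import fnmatchcase


def matching_files(pattern, file_names):
    """Return the file names that the pattern matches, sorted."""
    return sorted(name for name in file_names if fnmatchcase(name, pattern))


def matches_only_current(pattern, file_names, current_file):
    """True if the pattern matches the current log file and nothing else."""
    return matching_files(pattern, file_names) == [current_file]


# Hypothetical directory listing.
names = ["error.2014-06-09.log", "error.2014-06-10.log", "access.log"]

print(matching_files("error.*.log", names))
print(matches_only_current("error.*.log", names, "error.2014-06-10.log"))
```

In this listing, error.*.log also matches the older rolled-over file, which is the kind of extra data the pattern should avoid.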

Group access

If you are not enforcing access control on your data collectors, disable the data access control setting.  

Instead of selecting every access group for every data collector, it is more efficient to disable the data access control setting.

Time zone

Do not set a time zone explicitly for any log file that already contains a time zone. If a time zone is not set explicitly and the log file does not contain time zone information, the IT Data Analytics server time zone is used when converting the date and time to UTC.

Note: If the data you are collecting has a time stamp that is more than 24 hours in the future, that data is not indexed. Therefore, you must ensure that the time settings on the target hosts and the collection hosts are set up correctly and are synchronized.

If your data file does not have a reference to the year in the time stamp (as in Syslog files, for example), and at the time of indexing the product detects that the time at which the data occurred is ahead of the current time, the product assumes that this data is from the previous year. Such an instance might occur if the time settings for the target host and the collection host time are not synchronized. Based on the maximum data retention period (if set in days), such data might not be indexed.

Example: If the product server date and time are set to June 10, 2014 2:45 AM, and the events received have a date and time stamp of July 10, 3:45 AM, the product assumes that the year in which the data occurred is 2013 (the previous year). If the data retention period in the product is set to 15 days, this data is not indexed, because the time at which the data occurred is outside of the maximum data-retention period.
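
The year inference in this example can be worked through in code. The following sketch only illustrates the rule as described here; the function names are hypothetical, time zones are ignored, and February 29 is not handled.

```python
from datetime import datetime, timedelta


def infer_event_time(month, day, hour, minute, now):
    """Attach a year to a time stamp that has none.

    If the time stamp would lie ahead of the current time, assume that the
    event occurred in the previous year.
    """
    candidate = datetime(now.year, month, day, hour, minute)
    if candidate > now:
        candidate = candidate.replace(year=now.year - 1)
    return candidate


def within_retention(event_time, now, retention_days):
    """True if the event is no older than the data retention period."""
    return now - event_time <= timedelta(days=retention_days)


now = datetime(2014, 6, 10, 2, 45)            # product server date and time
event = infer_event_time(7, 10, 3, 45, now)   # July 10, 3:45 AM, no year

print(event.year)
print(within_retention(event, now, retention_days=15))
```

With these inputs the inferred year is 2013, so the event is far older than 15 days and falls outside the retention period.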

Scenarios when data is not collected

The following table provides scenarios in which data might not be collected and therefore is not searchable.

Scenario

Description

When a data collector is stopped and then started

Data is not collected for the time during which the data collector remains stopped.

When a new Collection Station is added to the pool

This involves a restart of all the Collection Agents. Data is not collected for the time taken by the Collection Agents to restart.

When the configuremasters CLI command is run

This involves a restart of the Collection Station. Data is not collected for the time taken by the Collection Station to restart.

When the movecomponents CLI command is run

This involves a restart of all the Collection Agents. Data is not collected for the time taken by the Collection Agents to restart.

When the data collector is started, past data is not collected

This is applicable for all data collectors except the Monitor using External Configuration and Monitor Remote Windows Events data collectors.

If the Payload channel is full

This can occur for a number of reasons. For example, the Collection Station might not be reachable, or there might be a sudden burst of incoming data.