Managing data collectors


Data collectors are responsible for collecting your data so that it can be indexed and made available for search. On the Data Collectors tab, you can configure data collectors to collect data from particular data sources, depending on the data pattern used.

What kind of data can I collect?

You can collect the following kinds of data:

  • Any kind of machine data, such as logs and events from applications (including web servers and databases) and servers
  • Historical data and data generated continuously

Notes

  • You can collect data with UTF-8 character encoding only.
  • You cannot collect data in which the time stamp contains non-English characters.

You can collect data for one-time or continuous monitoring.

When you create a data collector, at a minimum, you need to specify information about:

  • Your data source (for example, target server where the data is located and file location)
  • How you want to index the data (for example, data pattern to use)
  • How frequently you want to collect the data (for example, poll interval)

The Indexer uses this information to index the data and make it available as events that can be searched immediately. If the data is not indexed the way you need, you can modify the data pattern and check whether the results match your criteria.

Where is my data?

The data that you want to collect can be on the same computer on which the Collection Station (or Collection Agent) is installed (local data), or it can be on a different computer (remote data). You can collect data remotely by creating an SSH connection or connecting to a shared network drive on a Windows computer.

For more information, see Local and remote data collection.

Which Agent type should I use?

You can collect data by using one of the following collection mechanisms:

  • Collection Station—An entity that is automatically installed when you install the product and is responsible for actually collecting data and providing it to the Indexer for further processing.
  • Collection Agent—Another entity that can be used for collecting data, but for this you must configure the BMC PATROL Knowledge Module for IT Data Analytics.

The Collection Agent is useful in the following scenarios:

  • You already have the BMC PATROL components installed in your environment.
  • You have a company policy that restricts direct communication from the Collection Station to the target host. For example, if you cannot open up the target host's firewall ports, the Collection Station cannot communicate with the target server.

To understand how to choose a data collection mechanism for your environment, see Agent-types.

For information about setting up the Collection Agent, see Setting up Collection Agents.

Data retention and deletion

After you create a data collector, data collection starts when the first poll happens. For example, suppose that you want to monitor a file to which data is continuously being added. Data is collected from the point at which the first poll happened, and the data that was already in the file before that point is ignored.

By default, the data retention period is seven days. This period defines the maximum duration for which data is retained in the system. You can change the default setting by navigating to Administration > System Settings.
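The "collect only from the first poll onward" behavior can be pictured with a small sketch. This is not the product's implementation: the class name TailCollector and its poll method are hypothetical, the string passed to poll stands in for the current contents of the monitored file, and the Read from Past setting is ignored.

```python
class TailCollector:
    """Hypothetical collector that skips content present before the first poll."""

    def __init__(self):
        self.offset = None  # not known until the first poll

    def poll(self, content: str) -> str:
        """Return the text appended since the previous poll.

        `content` stands in for the current contents of the monitored file.
        """
        if self.offset is None:
            # First poll: remember where the file currently ends, collect nothing.
            self.offset = len(content)
            return ""
        new_data = content[self.offset:]
        self.offset = len(content)
        return new_data


collector = TailCollector()
collector.poll("old entry\n")                      # first poll, nothing collected
print(collector.poll("old entry\nnew entry\n"))    # only the appended text
```

In this sketch the first poll only records a starting position, which is why data that existed in the file beforehand never reaches the index.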

The data retention period acts as a moving window (depicted in green in the following figure).

Suppose that, on the following time scale, you created a data collector at time T1. Data collection starts at T1, when the first poll happens. Data collected at T1 remains in the system until T1+7 days. As time passes, data older than the seven-day period is deleted and is no longer available for searching.

Figure: Data retention period as a moving window (data retention1.png)
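The moving window can also be expressed as a simple date comparison. The following is a minimal sketch, assuming the default seven-day retention period; the function name is_searchable is hypothetical, and the exact moment at which the product deletes expired data is not specified here.

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 7  # default retention period; configurable in System Settings


def is_searchable(collected_at, now, retention_days=RETENTION_DAYS):
    """Return True while the data is still inside the moving retention window."""
    return now - collected_at <= timedelta(days=retention_days)


t1 = datetime(2024, 1, 1, 12, 0)  # hypothetical collection time T1
print(is_searchable(t1, t1 + timedelta(days=3)))  # inside the window
print(is_searchable(t1, t1 + timedelta(days=8)))  # older than the window
```

Because the comparison is always made against the current time, the window moves forward as time passes, and data collected at T1 falls out of it after T1+7 days.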

The data retention period affects the Read from Past (# days) function, which defines how far back from the current time data can be collected. This setting is available for the following data collectors:

Note

After the data collector is created, it might take approximately 1 minute for the first poll to happen. The first poll prepares the data collector for data collection. Data is fetched only from the second poll onward.

Expected time delay (to see the first set of data for a search) = (Time for first poll) + (Poll interval set for the data collector).
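As a worked example of this formula, the following sketch adds the two terms in seconds. The function name expected_delay_seconds is hypothetical, the 60-second value reflects the approximate first-poll time mentioned in this note, and the 300-second poll interval is an assumed setting chosen for illustration.

```python
def expected_delay_seconds(first_poll_seconds, poll_interval_seconds):
    """Expected delay before the first data appears in a search."""
    return first_poll_seconds + poll_interval_seconds


# About 1 minute for the first poll, plus an assumed 5-minute poll interval.
print(expected_delay_seconds(60, 300))  # 360 seconds, that is, 6 minutes
```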

Kinds of data collectors

Depending on the data sources and whether you want to perform local or remote collection, data collectors can be categorized as follows:

Functions available while creating data collectors

  • Specify a rollover pattern for collecting rolling logs.
  • Read data from subdirectories of a parent directory.
  • Create a host containing details about the target and the collection host, and reuse this information while creating a data collector. For more information about creating hosts, see Managing-hosts.
  • Create a credential profile containing credentials to connect with the server where the data is located. You can reuse this credential profile while creating data collectors for the Windows operating system. For more information about creating credential profiles, see Managing-credentials.
  • Specify group access permissions so that particular user groups can access and search the data coming from particular data sources.
  • Add tags that can later be used for effectively searching the data from particular data sources.
  • Filter the relevant data patterns (by using the Filter relevant data pattern icon next to the Pattern field) to automatically detect the data patterns that match your data file.
  • Select a data pattern that you think might be most appropriate and use the preview option (the Preview parsed log entries icon next to the Pattern field) to see how the parsed data records look. If the selected data pattern does not satisfy your needs, you can select another data pattern and preview the data records again, until you are satisfied with the results.
  • If none of the filtered data patterns suit your needs, you can add a new data pattern.

Collecting product metrics

You can collect and analyze metrics (or logs) generated by the BMC TrueSight IT Data Analytics product for the Collection Station and Search components. After you install the product, the data collector for the Collection Station metrics is created automatically. You can also view a line chart summarizing this data over the last week in the top-right quadrant of the Search tab. However, you must create the data collector for the Search component logs yourself.

For more information, see Monitoring-the-product-metric-files.

Viewing and searching configured data collectors

The Data Collectors tab allows you to manage data collectors. To access this tab, navigate to Administration > Data Collectors.

This tab displays a default data collector for collecting the data in the Collection_metrics.log file. The Search tab displays a graph summarizing the data collected by this data collector. For more information, see Collecting product metrics.

You can perform the following actions on the Data Collectors tab.

The Data Collectors tab provides the following information:

 
