Setting up data collection
TrueSight IT Data Analytics helps you analyze data coming from various disparate data sources in a single console in a simple and meaningful way. Setting up data collection is the first step in the process of using TrueSight IT Data Analytics to search, analyze, and visualize data.
Before you start collecting data, you need to understand the kind of data that you want to collect, the source of data, the kind of Agent you want to use for collecting the data, and the method of data collection.
The following workflow describes the high-level process involved in collecting data:
Identify the data that you want to collect
You can collect the following kinds of data:
- Any kind of machine data, such as logs and events from servers and applications (including web servers and databases)
- Historical data and data generated continuously
You cannot collect data in which the time stamp contains non-English characters.
You can collect data for one-time or continuous monitoring.
Determine the Agent type
When planning your data-collection configuration, be sure to choose the Agent type (Collection Station or Collection Agent) that aligns with your existing environment or architecture. While creating a data collector, you need to select the collection host (the computer where the Agent resides). The Agent selected while creating a data collector is the one that actually collects the data. For more information, see
The user (super admin or app admin role) who installs the Collection Agent (or Collection Station) on the target host can collect data from all the files on the target host (including system files). You can restrict unauthorized access to files by changing operating system level file permissions.
The following resources can help you change permissions.
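On Linux, for example, you can tighten operating-system-level permissions so that only the file's owner (and optionally its group) can read a log file. The following is a minimal sketch using Python's standard library; the file itself is a temporary stand-in created just for the demonstration.

```python
import os
import stat
import tempfile

# Create a sample log file to demonstrate tightening permissions.
fd, path = tempfile.mkstemp(suffix=".log")
os.close(fd)

# Restrict the file to owner read/write and group read only (mode 640),
# removing all access for other users. Equivalent to `chmod 640 file.log`.
os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)

mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))  # -> 0o640
os.remove(path)
```

The same effect can be achieved directly with `chmod 640` on the command line; the key point is that the Agent's effective user must retain read access to the files you want collected.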
Decide the data collector configuration option
To be able to collect data, you need to create data collectors. Data collectors contain inputs that define when, how, and from where to collect data. Data collectors are run by Collection Stations or Collection Agents.
You can create data collectors in the following ways. You are likely to progress from one approach to another over time.
Individual data collector creation
You can create individual data collectors from the Administration > Data Collectors tab. This is the easiest and most common method of collecting data. If you need multiple data collectors of the same type, you can use the clone feature. However, cloning can become cumbersome at scale, in which case you can use collection profiles to create data collectors in bulk.
For more information, see Collecting data into the system.
Bulk data collector creation with collection profiles
You can create collection profiles from the Administration > Collection Profiles tab. Collection profiles contain data-collector templates that help you automate your data-collector configuration.
This approach is useful when you have a host-centric view of your environment (for example, when using PATROL Agent with PATROL for IT Data Analytics). Suppose that in your environment you have seven Linux hosts. Two of the Linux hosts have the JBOSS application installed on them, while the remaining five hosts have the Apache Tomcat application installed on them. Now you can create collection profiles with data-collector templates in the following ways:
Create a collection profile for the JBOSS application and apply it to the appropriate Linux hosts.
Create a collection profile for Apache Tomcat application and apply it to the appropriate Linux hosts.
After the collection profile is associated with a host, data collectors are automatically created and are ready to start data collection.
You can associate a collection profile with a collection host in one of the following ways:
By editing an existing host or while creating a new host from the Administration > Hosts page.
By running the applycollectionprofilestohost CLI command.
For more information about using collection profiles, see Creating multiple data collectors with collection profiles.
Bulk data collector creation with CLIs
Identify the data collector type
The following table lists data collectors categorized by the data source and based on whether the data collector is meant for local or remote data collection. For example, if you want to collect data from files and directories locally, you need to create the Monitor file on Collection Agent type of data collector.
You might want to experiment with the various data collectors to determine what works best for your environment.
| Goal | Local / remote? | Data collector type |
| --- | --- | --- |
| Collect data that comes from various files and directories. Note: The Upload File type of data collector can be used to upload a file for one-time collection of data. | | |
| Collect data that is generated as a result of running a script. | Local | |
| Collect and index Windows events remotely. | Remote | |
| Collect and index Windows events locally. | Local | |
| Collect events directly from supported external systems such as ProactiveNet or TrueSight Infrastructure Management. | | |
| Collect Syslog events over a TCP or UDP connection. | Remote | |
| Collect data over an HTTP or HTTPS connection. | Remote | |
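Before pointing a Syslog data collector at a TCP or UDP port, it can be useful to confirm that messages actually arrive on that port. The following self-contained sketch uses Python's standard `logging` module; the in-process UDP listener stands in for the real receiver, and in practice the host and port would be those of your Collection Station (placeholders here).

```python
import logging
import logging.handlers
import socket

# Simple in-process UDP listener standing in for a Syslog receiver, so
# the example is self-contained; in practice you would point the handler
# at your Collection Station's host and port instead.
recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv.bind(("127.0.0.1", 0))          # OS picks a free port
host, port = recv.getsockname()

# SysLogHandler sends records as syslog-style UDP datagrams by default.
handler = logging.handlers.SysLogHandler(address=(host, port))
logger = logging.getLogger("collector-check")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("test message from data collector setup")

datagram, _ = recv.recvfrom(4096)
print(b"test message" in datagram)  # -> True
recv.close()
handler.close()
```

For TCP, `SysLogHandler` accepts a `socktype=socket.SOCK_STREAM` argument; the same reachability check applies.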
Identify a data pattern to extract fields
This step is optional, but recommended.
Adding fields during data-pattern creation can make search more effective. Fields are name=value pairs that add meaning to the data. At the time of data collection, the product automatically extracts certain knowledge from the data (such as the timestamp and name=value pairs already present in the data). But the bulk of the field extraction depends on the data pattern used for collecting the data. Fields help you classify and extract important portions of your data that might otherwise go unnoticed. For more information, see About field extraction.
While creating a data collector, you need to assign a data pattern (and optionally a date format) matching the data that you want to collect. Based on the data pattern, the data collector collects the data, extracts fields, and makes the data available as a series of individual records that you can search.
To be able to assign a data pattern, you need to ensure that a data pattern matching the data to be collected already exists. TrueSight IT Data Analytics provides a list of default data patterns that you can use directly while creating a data collector, so in most cases you might not need to create a data pattern. If you do not find a data pattern that suits your needs, at a minimum, you can choose to extract the timestamp and index the remaining data as free text. However, if you need a richer classification of fields, then you need to create a new data pattern or customize an existing data pattern that closely matches the data to be collected. For more information, see Setting up data patterns to extract fields.
This step might have to be revisited periodically as your usage patterns become more clear. When you start collecting data for the first time, you might not be able to identify all the fields that can be helpful while searching the data. But as you analyze the data more and more, it is possible that you want to extract additional fields. To extract additional fields, you might need to edit the existing data pattern or create a new one based on your needs.
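The product's own data-pattern syntax is not reproduced here, but the underlying idea can be illustrated with a named-group regular expression over a hypothetical Apache-style access log line. The field names below (client, method, path, status, bytes) are illustrative, not product-defined fields.

```python
import re
from datetime import datetime

# Hypothetical access log line used only for illustration.
line = '192.0.2.10 - - [21/Mar/2024:13:45:30 +0000] "GET /index.html HTTP/1.1" 200 512'

# A "data pattern" in miniature: named groups carve the raw line into fields.
pattern = re.compile(
    r'(?P<client>\S+) \S+ \S+ '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)

fields = pattern.match(line).groupdict()

# The timestamp is parsed with a separate date format, mirroring the
# optional date format you assign alongside a data pattern.
ts = datetime.strptime(fields.pop("timestamp"), "%d/%b/%Y:%H:%M:%S %z")

print(fields["status"])  # -> 200
```

Each collected line that matches the pattern becomes one record carrying these name=value pairs, which is what makes field-based search filters possible.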
Create hosts to simplify management of data collectors
Creating hosts in your system first and then creating data collectors associated with those host objects allows key properties to be inherited by the data collectors assigned to each host. The following properties of a host can be inherited by the data collectors assigned to it:
- Host name
- Host-level tags
- Host-level access groups
Using hosts ensures consistency and avoids instances in which you accidentally forget to set up the same property for each data collector.
Using collection profiles is another way of leveraging host objects. One or more collection profiles can be applied against a host object. Data collectors for all data-collector templates (contained in the collection profiles) are created for the host. For example, in the following table, two collection profiles are applied to a single host, and a data collector is created for each template.
| Collection Profile 1 | Collection Profile 2 | Host 1 | Data collectors created |
| --- | --- | --- | --- |
| Data Collector Template 1 (T1), Data Collector Template 2 (T2) | Data Collector Template 3 (T3) | Collection Profile 1 and Collection Profile 2 applied | Data collectors based on T1, T2, and T3 |
This approach can be useful when you are using PATROL for IT Data Analytics with a PATROL Agent associated with each host.
For more information, see Setting up host profiles for target computers.
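Conceptually, the mapping from hosts and profiles to data collectors can be sketched as follows. This is an illustrative model only; the class and field names are hypothetical and do not reflect the product's internal objects or APIs.

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    tags: dict = field(default_factory=dict)           # host-level tags
    access_groups: list = field(default_factory=list)  # host-level access groups

@dataclass
class DataCollector:
    template: str
    host: Host

    @property
    def tags(self):
        # Data collectors inherit tags from the host they are assigned to.
        return self.host.tags

def apply_profiles(host, profiles):
    """One data collector is created per template in each applied profile."""
    return [DataCollector(t, host) for templates in profiles for t in templates]

host = Host("linux-host-1",
            tags={"os": "Linux", "location": "Houston"},
            access_groups=["admins"])
profile1 = ["T1", "T2"]   # Collection Profile 1: two templates
profile2 = ["T3"]         # Collection Profile 2: one template

collectors = apply_profiles(host, [profile1, profile2])
print([c.template for c in collectors])  # -> ['T1', 'T2', 'T3']
print(collectors[0].tags["location"])    # -> Houston
```

The design point is that tags and access groups live on the host object once, and every collector assigned to that host picks them up automatically.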
Create credentials to store user names and passwords
For any data collectors that require username and password credentials, it is recommended that you create stored credential objects that can be referenced by the data collector.
Using credentials can be useful in the following scenarios:
- You plan to use the CLI import/export feature: When you export data collectors by using the CLI command, the passwords are not saved as a part of the exported file. If you use a stored credential in the data collector instead of manually providing details, when you export the data collector, you do not need to make manual password changes before actually importing the data collector into the system.
- Your applications require periodic password change: If you have a company policy that requires a periodic password change for applications, then by editing the stored credentials you can apply the password change to all data collectors referencing that credential.
For more information, see Setting up credentials to access the target computers.
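The benefit of referencing one shared credential object, rather than embedding a password in each collector, can be sketched like this. The names below are illustrative, not the product's API.

```python
# Illustrative sketch: one stored credential referenced by several data
# collectors, so a password rotation touches a single object.
class Credential:
    def __init__(self, username, password):
        self.username = username
        self.password = password

class DataCollector:
    def __init__(self, name, credential):
        self.name = name
        self.credential = credential  # a reference, not a copy

shared = Credential("svc_itda", "old-secret")
collectors = [DataCollector(f"dc-{i}", shared) for i in range(3)]

# Rotate the password once; every collector referencing it sees the change.
shared.password = "new-secret"

print({c.credential.password for c in collectors})  # -> {'new-secret'}
```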
Define tags and user groups
This step is optional but recommended, especially if you want to search more effectively or restrict access to the collected data.
Setting tags properly enables you to filter or isolate searches intuitively. Identifying tags can be a periodic step as your usage patterns become clearer. When you start collecting data for the first time, you might not be able to identify all the tags worth assigning to a particular data collector. But as you search and analyze the collected data, you might find the need to define tags.
Tags allow administrators to associate a set of properties with each data collector and the data it collects. Data collector tags must represent data properties that are clearly defined across all data sources; for example, location, application group, and OS are common tag names that apply across most data sources. Add tags only if they are likely to be useful in search filtering. Tags carry some performance overhead, so think through a clear tag convention ahead of time and define only those tags that will actually be used.
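As a rough illustration of how consistent tag names pay off at search time, consider the following sketch; the records and tag names below are made up for the example.

```python
# Hypothetical indexed records, each carrying the tags set on its data
# collector; consistent tag names (location, app) make filtering trivial.
records = [
    {"msg": "login ok",     "tags": {"location": "Houston", "app": "tomcat"}},
    {"msg": "disk warning", "tags": {"location": "Austin",  "app": "jboss"}},
    {"msg": "login failed", "tags": {"location": "Houston", "app": "jboss"}},
]

def search(records, **tag_filters):
    """Keep records whose tags match every requested name=value pair."""
    return [r for r in records
            if all(r["tags"].get(k) == v for k, v in tag_filters.items())]

print([r["msg"] for r in search(records, location="Houston")])
# -> ['login ok', 'login failed']
```

If one collector had tagged `location=Houston` and another `site=HOU`, the same filter would silently miss data, which is why a single agreed tag convention matters.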
User groups are important if you want to restrict access to the collected data to particular users only. While creating a data collector, you can select the user groups to which you want to provide access. For more information, see Managing user groups in IT Data Analytics.
Create data collectors to start data collection
Based on the data collector configuration option selected earlier, navigate to one of the following topics:
| Data collection configuration option | Resources |
| --- | --- |
| Create data collectors via the Administration > Data Collectors tab | Collecting data into the system |
| Create collection profiles via the Administration > Collection Profiles tab | Creating multiple data collectors with collection profiles |
| Use CLI commands to export and import data collectors | exportcollector CLI command, importcollector CLI command |