Setting up data collection

TrueSight IT Data Analytics helps you analyze data coming from various disparate data sources in a single console in a simple and meaningful way. Setting up data collection is the first step in the process of using TrueSight IT Data Analytics to search, analyze, and visualize data.

Before you start collecting data, you need to understand the kind of data that you want to collect, the source of data, the kind of Agent you want to use for collecting the data, and the method of data collection.


The following workflow describes the high-level process involved in collecting data:

[Figure: data collection process workflow]

Standalone Agent and Standalone Collection Agent

All references to the Standalone Agent or Standalone Collection Agent in this document are applicable only if you are using IT Data Analytics version 11.3.01. The latest version released for the Standalone Agent is 11.3.01. Starting with version 11.3.02, no further versions of the Standalone Agent will be released. However, note the following information:

You can continue to use Standalone Agent version 11.3.01 with IT Data Analytics version 11.3.02.
If you have created Data Collectors using a Standalone Agent in version 11.3.01, the data collection will continue to work with IT Data Analytics version 11.3.02.
You can also edit the Data Collectors to use PATROL Agent instead of a Standalone Agent in IT Data Analytics version 11.3.02.

Identify the data that you want to collect

You can collect the following kinds of data:

Any kind of machine data, such as logs and events from applications (including web servers and databases) and servers
Historical data and data generated continuously

Note

You cannot collect data in which the time stamp contains non-English characters.
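As a rough pre-check for the limitation above, the following Python sketch flags lines whose leading time stamp contains non-ASCII characters (for example, a localized month name). The function name, the sample lines, and the assumption that the time stamp occupies the first 24 characters of each line are illustrative only and are not part of the product. ASCII is used here only as an approximation of "English characters".

```python
def timestamp_is_ascii(line: str, timestamp_width: int = 24) -> bool:
    """Return True if the assumed time stamp prefix contains only ASCII characters."""
    return line[:timestamp_width].isascii()


# Hypothetical sample lines; the second one has a French month abbreviation.
samples = [
    "2015-03-07 10:15:32,123 INFO Server started",
    "07 févr. 2015 10:15:32 INFO Serveur démarré",
]

# Lines that would need attention before collection.
flagged = [s for s in samples if not timestamp_is_ascii(s)]
```

Adjust timestamp_width to the length of the time stamp in your own data before relying on the result.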

You can collect data for one-time or continuous monitoring.


Determine the Agent type

When planning your data-collection configuration, be sure to choose the Agent type (Collection Station or Collection Agent) that aligns with your existing environment or architecture. While creating a data collector, you need to select the collection host (computer where the agent resides). The Agent selected while creating a data collector is used for actually collecting the data. For more information, see Agent types.

After you choose the Agent type, you need to set up the Agents so that you can use them later while creating data collectors. For more information, see the following topics:

To understand how to set up Collection Agents, see Setting up Collection Agents.
To understand how to set up additional Collection Stations, see Installing TrueSight IT Data Analytics in a multiple server environment.

Note

The user (super admin or app admin role) who installs the Collection Agent (or Collection Station) on the target host can collect data from all the files on the target host (including system files). You can restrict unauthorized access to files by changing operating system level file permissions.

The following resources can help you change permissions.

Windows: Security and protection in Windows Server 2008
Linux: About file permissions
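On Linux, tightening file permissions might look like the following Python sketch, which clears the read, write, and execute bits for "other" users on a file. It uses only the standard os and stat modules; the helper name and the temporary demo file are hypothetical. Whether a given file remains readable by the Agent depends on the account the Agent runs under, which this sketch does not determine.

```python
import os
import stat
import tempfile


def remove_other_access(path: str) -> None:
    """Clear the read, write, and execute bits for 'other' users on a POSIX file."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current & ~stat.S_IRWXO)


# Demonstration on a temporary file rather than on a real system file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name

os.chmod(demo_path, 0o644)          # owner rw, group r, other r
remove_other_access(demo_path)      # other bits cleared
mode_after = stat.S_IMODE(os.stat(demo_path).st_mode)
os.remove(demo_path)
```

On Windows, os.chmod affects only the read-only flag, so use the operating system's own access control tools there.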


Decide the data collector configuration option

To be able to collect data, you need to create data collectors. Data collectors contain inputs that define when, how, and from where to collect data. Data collectors are run by Collection Stations or Collection Agents.

You can create data collectors in the following ways. You are likely to progress from one approach to another over time.

Individual data collector creation
Bulk data collector creation with collection profiles
Bulk data collector creation with CLIs

Individual data collector creation

You can create individual data collectors from the Administration > Data Collectors tab. This is the easiest and most common method of collecting data. If you need to create multiple data collectors of the same type, you can use the clone feature. However, this can become cumbersome, in which case you can use collection profiles to create data collectors in bulk.

For more information, see Collecting data into the system.

Bulk data collector creation with collection profiles

You can create collection profiles from the Administration > Collection Profiles tab. Collection profiles contain data-collector templates that help you automate your data-collector configuration.

Collection profiles allow you to save multiple data-collector configuration settings (typically associated with a host) to apply against other hosts simultaneously. This approach promotes consistency and is more efficient when compared to individual data-collector configuration.

This approach is useful when you have a host-centric view of your environment (for example, when using PATROL Agent with PATROL for IT Data Analytics). Suppose that in your environment you have seven Linux hosts. Two of the Linux hosts have the JBOSS application installed on them, while the remaining five hosts have the Apache Tomcat application installed on them. Now you can create collection profiles with data-collector templates in the following ways:

Create a collection profile for the JBOSS application and apply it to the appropriate Linux hosts.
Create a collection profile for the Apache Tomcat application and apply it to the appropriate Linux hosts.

After the collection profile is associated with a host, data collectors are automatically created and are ready to start data collection.
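The seven-host example can be modeled with a short Python sketch that expands each profile's templates into one data collector per matching host. This is only a conceptual model of the outcome, not how the product implements profiles. The host names, template names, and the "template@host" naming are invented for illustration.

```python
# Hypothetical inventory: two JBoss hosts and five Tomcat hosts.
hosts = {
    "linux01": "JBoss",
    "linux02": "JBoss",
    "linux03": "Tomcat",
    "linux04": "Tomcat",
    "linux05": "Tomcat",
    "linux06": "Tomcat",
    "linux07": "Tomcat",
}

# Hypothetical collection profiles, each holding data-collector templates.
profiles = {
    "JBoss": ["jboss-server-log"],
    "Tomcat": ["tomcat-catalina-log", "tomcat-access-log"],
}


def expand(hosts: dict, profiles: dict) -> list:
    """Return one data collector name per (template, host) pair."""
    created = []
    for host, application in sorted(hosts.items()):
        for template in profiles[application]:
            created.append(f"{template}@{host}")
    return created


collectors = expand(hosts, profiles)
```

With this inventory, two profiles replace twelve individually configured data collectors (2 hosts x 1 template plus 5 hosts x 2 templates).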

You can associate a collection profile with a collection host in one of the following ways:

By editing an existing host or while creating a new host from the Administration > Hosts page.
By running the applycollectionprofilestohost CLI command.

For more information about using collection profiles, see Creating multiple data collectors with collection profiles.

Bulk data collector creation with CLIs

The product command line interface supports export and import of data-collector configurations.

Export a small set of data collectors and then use that base export file as a master copy. Make changes in the exported copy and then import the copy into the product.

This approach can be an efficient and consistent method of configuring data collectors. It ensures that you have a backup copy of your configuration settings in files outside of the product (in the event of a serious failure). It also allows you to automate data-collector configuration, because the command line can be triggered from other scripts or workflows.
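The "master copy" idea can be sketched in Python as follows: read an exported file once, then write one edited copy per target host. The format of the real export file is not described here, so the file content, the MASTER_HOST placeholder, and the helper name below are stand-ins for illustration only. The export and import steps themselves would still be performed with the product CLI.

```python
import tempfile
from pathlib import Path


def make_variants(master: Path, placeholder: str, hosts: list, out_dir: Path) -> list:
    """Write one copy of the master file per host, substituting the placeholder."""
    text = master.read_text(encoding="utf-8")
    written = []
    for host in hosts:
        target = out_dir / f"{master.stem}-{host}{master.suffix}"
        target.write_text(text.replace(placeholder, host), encoding="utf-8")
        written.append(target)
    return written


# Demonstration with a made-up master file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    master = work / "collectors.txt"
    master.write_text("name=app-log\nhost=MASTER_HOST\n", encoding="utf-8")
    variants = make_variants(master, "MASTER_HOST", ["hostA", "hostB"], work)
    variant_names = [p.name for p in variants]
    first_variant_text = variants[0].read_text(encoding="utf-8")
```

Keeping the master file under version control gives you the backup copy mentioned above.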


Identify the data collector type

The following table lists data collectors categorized by the data source and based on whether the data collector is meant for local or remote data collection. For example, if you want to collect data from files and directories locally, you need to create the Monitor file on Collection Agent type of data collector.

You might want to experiment with the various data collectors to determine what works best for your environment.

Goal	Local / remote?	Data collector type
Collect data that comes from various files and directories. Note: The Upload File type of data collector can be used to upload a file for one-time collection of data.	Local	Monitor file on Collection Agent
	Remote	Monitor File over SSH
	Remote	Monitor over Windows Share
	Remote	Upload file
Collect data that is generated as a result of running a script.	Local	Monitor Script Output on Collection Agent
	Remote	Monitor script output over SSH
Collect and index Windows events remotely.	Remote	Monitor Remote Windows Events
Collect and index Windows events locally.	Local	Monitor Local Windows Events
Collect events directly from supported external systems such as ProactiveNet, or TrueSight Infrastructure Management.	Remote	Monitor using External Configuration
Collect Syslog events over a TCP or UDP connection.	Remote	Receive over TCP/UDP
Collect data over an HTTP or HTTPS connection.	Remote	Receive over HTTP/HTTPS
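If you keep planning notes in script form, the table above can be captured as a simple lookup, as in the following Python sketch. The data collector type names are copied from the table. The short keys ("files", "syslog", and so on) and the function name are shorthand invented for this example.

```python
# (data source, local/remote) -> data collector types, transcribed from the table.
COLLECTOR_TYPES = {
    ("files", "local"): ["Monitor file on Collection Agent"],
    ("files", "remote"): [
        "Monitor File over SSH",
        "Monitor over Windows Share",
        "Upload file",
    ],
    ("script output", "local"): ["Monitor Script Output on Collection Agent"],
    ("script output", "remote"): ["Monitor script output over SSH"],
    ("windows events", "local"): ["Monitor Local Windows Events"],
    ("windows events", "remote"): ["Monitor Remote Windows Events"],
    ("external systems", "remote"): ["Monitor using External Configuration"],
    ("syslog", "remote"): ["Receive over TCP/UDP"],
    ("http", "remote"): ["Receive over HTTP/HTTPS"],
}


def candidates(source: str, location: str) -> list:
    """Return the data collector types listed for a source and location."""
    return COLLECTOR_TYPES.get((source.lower(), location.lower()), [])


local_file_types = candidates("Files", "Local")
```

An empty list means the table has no entry for that combination, for example local Syslog collection.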


Identify a data pattern to extract fields

This step is optional but recommended.

Adding fields during data-pattern creation can make search more effective. Fields are name=value pairs that add meaning to the data. At the time of data collection, the product automatically extracts certain information from the data (such as the timestamp and any name=value pairs already present in the data). However, the bulk of the field extraction depends on the data pattern used for collecting the data. Fields help you classify and extract important portions of your data that might otherwise go unnoticed. For more information, see About field extraction.
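To illustrate what name=value pairs look like once extracted, the following Python sketch pulls them out of a sample line with a regular expression. It is not the product's extraction logic. The pattern, the function name, and the sample line are assumptions made for this example.

```python
import re

# A name, an equals sign, then either a double-quoted value or a run of non-space characters.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def extract_pairs(line: str) -> dict:
    """Return the name=value pairs found in a line as a dictionary."""
    return {name: value.strip('"') for name, value in PAIR.findall(line)}


line = 'user=alice action=login status=failed reason="bad password"'
fields = extract_pairs(line)
```

Each key in the resulting dictionary corresponds to a field that a search could filter on, such as status=failed.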

While creating a data collector, you need to assign a data pattern (and optionally date format) matching the data that you want to collect. Based on the data pattern, the data collector collects data, extracts fields, and makes the data available as a series of individual records, on which you can search.

To assign a data pattern, you need to ensure that a data pattern matching the data to be collected already exists. TrueSight IT Data Analytics provides a list of default data patterns that you can use directly while creating a data collector. Therefore, in most cases you might not need to create a data pattern. If you do not find a data pattern that suits your needs, at a minimum, you can choose to extract the timestamp and index the remaining data as free text. However, if you need a richer classification of fields, you need to create a new data pattern or customize an existing data pattern that closely matches the data to be collected. For more information, see Setting up data patterns to extract fields.
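The minimal option (extract the timestamp and treat the rest as free text) can be pictured with the following Python sketch. The date format, the fixed timestamp width of 19 characters, and the sample line are assumptions chosen for the example and do not reflect a product default.

```python
from datetime import datetime


def split_record(line: str, date_format: str = "%Y-%m-%d %H:%M:%S", width: int = 19) -> dict:
    """Parse the leading timestamp and keep the remainder as free text."""
    timestamp = datetime.strptime(line[:width], date_format)
    return {"timestamp": timestamp, "text": line[width:].strip()}


record = split_record("2015-03-07 10:15:32 Connection pool exhausted on node 3")
```

A richer data pattern would go further and break the free text into named fields.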

Note

You might need to revisit this step periodically as your usage patterns become clearer. When you start collecting data for the first time, you might not be able to identify all the fields that can be helpful while searching the data. As you analyze the data further, you might want to extract additional fields. To do so, you might need to edit the existing data pattern or create a new one based on your needs.


Create hosts to simplify management of data collectors

Creating hosts in your system first and then creating data collectors associated with the host objects allows key properties to be inherited by the data collectors assigned to that host. The following properties of a host can be inherited by the data collector assigned to it:

Host name
Host-level tags
Host-level access groups

Using hosts ensures consistency and avoids instances in which you accidentally forget to set up the same property for each data collector.
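The inheritance described above can be modeled conceptually with the following Python sketch, in which each data collector reads its host name, tags, and access groups from a shared host object. The class names, attribute names, and sample values are hypothetical and do not correspond to product objects or APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    tags: dict = field(default_factory=dict)
    access_groups: set = field(default_factory=set)


@dataclass
class DataCollector:
    name: str
    host: Host

    @property
    def host_name(self) -> str:
        # Inherited from the host object.
        return self.host.name

    @property
    def tags(self) -> dict:
        # Host-level tags, copied so callers cannot modify the host by accident.
        return dict(self.host.tags)

    @property
    def access_groups(self) -> set:
        # Host-level access groups.
        return set(self.host.access_groups)


host = Host("linux01", tags={"location": "Houston"}, access_groups={"ops"})
collectors = [DataCollector("app-log", host), DataCollector("audit-log", host)]
```

Because the properties are defined once on the host, every data collector assigned to it reports the same values.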

Using collection profiles is another way of leveraging host objects. One or more collection profiles can be applied against a host object. Data collectors for all data-collector templates (contained in the collection profiles) are created for each host. For example, in the following table, you can see that two collection profiles are applied to a single host, and data collectors are created for each host.

Collection profiles applied	Collection Profile 1, Collection Profile 2
Data-collector templates	Data Collector Template 1 (T1), Data Collector Template 2 (T2), Data Collector Template 3 (T3)
Host	Host 1 (H1)
Data collectors created	T1H, T2H, T3H

This approach can be useful when you are using PATROL for IT Data Analytics with a PATROL Agent associated with each host.

For more information, see Setting up host profiles for target computers.


Create credentials to store user names and passwords

For any data collectors that require username and password credentials, it is recommended that you create stored credential objects that can be referenced by the data collector.

Using credentials can be useful in the following scenarios:

You plan to use the CLI import/export feature: When you export data collectors by using the CLI command, the passwords are not saved as a part of the exported file. If you use a stored credential in the data collector instead of manually providing details, when you export the data collector, you do not need to make manual password changes before actually importing the data collector into the system.
Your applications require periodic password change: If you have a company policy that requires a periodic password change for applications, then by editing the stored credentials you can apply the password change to all data collectors referencing that credential.
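The benefit of referencing a stored credential can be shown with a small conceptual model in Python: data collectors hold only the credential's name, so one edit to the stored credential reaches all of them. The dictionary layout, names, and passwords are made up for illustration and do not represent how the product stores credentials.

```python
# Hypothetical stored credential, keyed by name.
credentials = {
    "app-svc": {"username": "svc_logs", "password": "old-secret"},
}

# Data collectors reference the credential by name instead of embedding a password.
collectors = [
    {"name": "web-log", "credential": "app-svc"},
    {"name": "db-log", "credential": "app-svc"},
]


def resolve(collector: dict) -> dict:
    """Look up the stored credential that a data collector references."""
    return credentials[collector["credential"]]


# A single password change on the stored credential...
credentials["app-svc"]["password"] = "new-secret"

# ...is seen by every data collector that references it.
passwords = {c["name"]: resolve(c)["password"] for c in collectors}
```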

For more information, see Setting up credentials to access the target computers.


Define tags and user groups

This step is optional but recommended, especially if you want to search more effectively or restrict access to the collected data.

Setting tags properly enables you to filter or isolate search results intuitively. Identifying tags can be an iterative step as your usage patterns become clearer. When you start collecting data for the first time, you might not be able to identify the tags that can be assigned to a particular data collector. As you search and analyze the collected data, you might find that you need to define tags.

Tags allow administrators to associate a set of properties with each data collector and the data it collects. Tags that are set properly can be very useful in the search process. Data collector tags must represent data properties that are clearly defined across all data sources; for example, location, application group, and OS are common tag names that can be applied across most data sources. Add tags only if they might be useful in search filtering. Tags have some performance overhead associated with them, so think through a clear tag convention ahead of time and define only those tags that will be used.
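As a conceptual illustration of tag-based filtering, the following Python sketch selects records whose tags match all requested values. The record layout, tag values, and function name are invented for the example and are not the product's search syntax.

```python
# Hypothetical records, each carrying the tags of the data collector that produced it.
records = [
    {"message": "Disk usage at 91%", "tags": {"location": "Houston", "os": "Linux"}},
    {"message": "Service restarted", "tags": {"location": "Pune", "os": "Linux"}},
    {"message": "Login failure", "tags": {"location": "Houston", "os": "Windows"}},
]


def filter_by_tags(records: list, **wanted: str) -> list:
    """Return the records whose tags match every requested name and value."""
    return [
        r for r in records
        if all(r["tags"].get(name) == value for name, value in wanted.items())
    ]


houston_linux = filter_by_tags(records, location="Houston", os="Linux")
```

A consistent tag convention matters here: a record tagged "OS" instead of "os" would be missed by this filter.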

User groups are important if you want to restrict access to the collected data to particular users only. While creating a data collector, you can select the user groups to which you want to provide access. For more information, see Managing user groups in IT Data Analytics.

Note that while creating a data collector, by selecting the target host, you can also inherit tags and user groups associated with the host.

For more information about assigning tags and user groups to a data collector, see Creating data collectors.


Create data collectors to start data collection

Based on the data collector configuration option selected earlier, navigate to one of the following topics:

Data collection configuration option	Resources
Create data collectors via the Administration > Data Collectors tab	Collecting data into the system
Create collection profiles via the Administration > Collection Profiles tab	Creating multiple data collectors with collection profiles
Use CLI commands to export and import data collectors	exportcollector CLI command
Use CLI commands to export and import data collectors	importcollector CLI command

