You can create a data collector to collect logs from a particular host.
Note
This data collector does not work for mapped drives.
The following steps describe how to create a data collector for getting files into IT Data Analytics:
The following video (4:09) illustrates the process of creating a data collector for collecting the itda.log file. https://youtu.be/vB7StE8H-gM
In the Name box, provide a unique name to identify this data collector. From the Type list, select Monitor File on Collection Agent.
Provide the following information, as appropriate:
Field | Description |
---|---|
Target/Collection Host | |
Collection Host (Agent) | Type or select the collection host depending on whether you want to use the Collection Station or the Collection Agent to perform data collection. The collection host is the computer on which the Collection Station or the Collection Agent is located. By default, the Collection Station is already selected. You can either retain the default selection or select the Collection Agent. Note: For this type of data collector, the target host and collection host are expected to have the same values. |
Collector Inputs | |
Directory Path | Specify the absolute path of the directory that contains the log file. In the path, you can specify wildcards or system environment variables. Wildcards can be used to match a partial path or to include subdirectories. You can use the following wildcard characters:
For more information, see Using wildcards in the directory path. Tip (about specifying an environment variable): After you create the environment variable on the Collection Host, restart the Collection Agent (or Collection Station) that you will use to create the data collector. Otherwise, the environment variable is not applied, which might affect the Auto-Detect feature for assigning a data pattern. |
Filename/Rollover Pattern | Specify the file name only, or specify the file name with a rollover pattern to identify subsequent logs. You can use the following wildcard characters:
Specifying a rollover pattern can be useful to monitor rolling log files where the log files are saved with the same name but differentiated with some variable like the time stamp or a number. Specifying a wildcard can also be useful when you remember the file name only partially. Note: Ensure that you specify a rollover pattern for identifying log files that follow the same data format (which means they will be indexed with the same data pattern). |
Time Zone | (Optional) Accept the default Use file time zone option or select a time zone from the list. With the default option, data is indexed as per the time zone available in the data file. If the data file does not contain a time zone, then the time zone of the Collection Host (Collection Station or Collection Agent server) is used. Keep in mind that a manually selected time zone must match the time zone of the server from which you want to collect data. If you manually specify a time zone even though the file contains one, the manually specified time zone overrides the file time zone. |
Data Pattern | |
Pattern | Assign the data pattern (and optionally date format) for indexing the data file. The data pattern and date format together decide the way in which the data will be indexed. When you select a data pattern, the matching date format is automatically selected. However, you can override the date format by manually selecting another date format or by selecting the option to create a new date format. In that case, the date format is used to index the date and time string, while the rest of the data is indexed as per the selected data pattern. Instead of manually browsing through the list of available data patterns, you can click Auto-Detect to automatically find a list of matching data patterns. If no matching data patterns are found, then a list of matching date formats is displayed. When you select a date format, the date and time string (in the data) is indexed with the selected date format, while the rest of the data is indexed as free text. If you can find neither a matching data pattern nor a matching date format, then you can choose to index the data as free text. Depending on whether the data contains a date and time string, you can choose to assign the data pattern as Free Text with Timestamp or Free Text without Timestamp. All the records processed by using the Free Text without Timestamp option are assumed to be a single line of data with a line terminator at the end of the event. To distinguish records in a custom way, you can specify a custom string or regular expression in the Event Delimiter box, which determines where a new line starts in the data. If you are collecting JSON data, then depending on whether the data contains a date and time string, you can assign the data pattern as JSON with Timestamp or JSON without Timestamp. After assigning the data pattern (and optionally date format), you can preview the sample records. For more information, see Assigning the data pattern and date format. Notes:
|
Date Format | |
Date Locale | (Optional) You can use this setting to enable reading the date and time string based on the language selected. Note that this setting applies only to those portions of the date and time string that consist of letters (digits are not considered). By default, this value is set to English. You can manually select a language to override the default locale. For a list of supported languages, see Language information. |
File Encoding | If your data file uses a character set encoding other than UTF-8 (default), then do one of the following:
|
Poll Interval (mins) | Enter a number to specify the poll interval (in minutes) for the log collection. By default, this value is set to 1. |
Start/Stop Collection | (Optional) Select this check box if you want to start the data collection immediately. |
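
The time zone precedence described in the Time Zone field can be summarized in a short Python sketch. This is only an illustration of the rules stated in the table, not the product's implementation; the function name `effective_time_zone`, its parameters, and the sample time zone names are hypothetical.

```python
from typing import Optional

# Label of the default option in the Time Zone list, as named in the table above.
USE_FILE_TIME_ZONE = "Use file time zone"


def effective_time_zone(selected: str, file_tz: Optional[str], host_tz: str) -> str:
    """Return the time zone used for indexing (hypothetical helper).

    selected: the option chosen in the Time Zone list
    file_tz:  the time zone found in the data file, or None if it has none
    host_tz:  the time zone of the Collection Host
    """
    if selected != USE_FILE_TIME_ZONE:
        # A manually selected time zone overrides the one in the file.
        return selected
    if file_tz is not None:
        # Default option: use the time zone available in the data file.
        return file_tz
    # The file has no time zone: fall back to the Collection Host.
    return host_tz


if __name__ == "__main__":
    print(effective_time_zone(USE_FILE_TIME_ZONE, None, "America/Chicago"))
```

The manual selection is checked first because, per the table, it wins even when the file carries its own time zone.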
A wildcard is a character that can be used to substitute one or more characters while selecting files for monitoring.
Using wildcards in the directory path can be useful in the following scenarios:
Tip
Directory paths of Linux systems are case sensitive.
The following table lists the wildcards that you can use while specifying directory paths:
Wildcard | Can be used to... | Examples |
---|---|---|
* | Substitute zero or more characters in the directory path. | /app/subapp*/log/access_log/ matches the following paths:
|
? | Substitute exactly one character in the directory path. | /app/subapp?/log/access_log/ matches the following paths:
|
/app/subapp??/log/ matches the following paths:
| ||
** | Match a partial path or include subdirectories of the directory path, depending on where you place the wildcard in the path. To collect data from subdirectories, you need to specify the ** wildcard sequence at the end of the directory path. Note: This wildcard searches through directories and subdirectories at a maximum of five levels to find matches. Best practice: Using this wildcard in place of a very deep hierarchy of directories can negatively impact performance, so use it only where it is needed. For example, suppose you want to collect the itda.log. To do this, you can specify the following inputs:
When you specify the wildcard towards the beginning of the directory path, the search for directories happens at a deeper level and doing this can negatively impact performance. Conversely, when you specify the wildcard towards the end of the directory path, the search for directories happens on a limited set and doing this can improve performance. Thus, in this scenario, specifying C:/Program files/bmcsoftware/**/ is better than specifying C:/**/. Note: If you are using a Collection Agent earlier than version 2.5, then you can only specify this wildcard at the end of the directory path to include subdirectories. For example, you can specify /usr/local/**/ to collect the following logs:
| /usr/**/*_log matches the following paths:
|
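
To make the wildcard semantics in the table concrete, the following Python sketch translates a directory-path pattern into a regular expression. It is an approximation under stated assumptions, not the matching logic of the Collection Agent: it handles POSIX-style absolute paths only, assumes that `*` and `?` never cross a `/` boundary, and assumes that `**` stands for zero to five whole directory levels. The names `wildcard_to_regex`, `matches`, and `MAX_DEPTH` are hypothetical.

```python
import re

MAX_DEPTH = 5  # the table says ** searches at most five directory levels


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a directory-path wildcard pattern into a compiled regex.

    Assumptions (not confirmed by the product): * and ? stay inside one
    path segment, and ** covers zero to MAX_DEPTH directory levels.
    """
    parts = []
    for segment in pattern.strip("/").split("/"):
        if segment == "**":
            # Zero to MAX_DEPTH nested directories.
            parts.append(r"(?:/[^/]+){0,%d}" % MAX_DEPTH)
        else:
            # Escape literal text, then restore the two single-segment wildcards.
            seg = re.escape(segment)
            seg = seg.replace(r"\*", "[^/]*")  # * = zero or more characters
            seg = seg.replace(r"\?", "[^/]")   # ? = exactly one character
            parts.append("/" + seg)
    # Accept the path with or without a trailing slash.
    return re.compile("".join(parts) + "/?")


def matches(pattern: str, path: str) -> bool:
    """Return True if the whole path matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(path) is not None


if __name__ == "__main__":
    samples = [
        ("/app/subapp*/log/access_log/", "/app/subapp1/log/access_log/"),
        ("/app/subapp?/log/access_log/", "/app/subapp/log/access_log/"),
        ("/usr/local/**/", "/usr/local/a/b/c/"),
    ]
    for pattern, path in samples:
        print(pattern, path, matches(pattern, path))
```

Because the regular expression is compiled without `re.IGNORECASE`, matching is case sensitive, which is consistent with the tip about Linux directory paths.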