Collecting data from an individual file

You can collect individual log files (up to 100 MB in size) from the computer on which you are accessing the product Console. To collect data from an individual file, you need to create the Upload file data collector.

Note

BMC recommends that you not use this data collector for files that are updated dynamically.

Only the data present in the file at the time of upload is collected. Data added to the file after it has been uploaded is not collected. Uploading images, compressed files, and application files is not supported when you create the Upload file data collector.
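The restrictions above (a size limit of 100 MB by default and no images, compressed files, or application files) can be checked before you attempt an upload. The following is a minimal sketch, not part of the product: the function name `check_upload_candidate`, the extension list, and the assumption that 1 MB equals 1024 * 1024 bytes are all illustrative.

```python
import os

# Assumption: 1 MB is treated as 1024 * 1024 bytes; the product may count differently.
DEFAULT_LIMIT_MB = 100

# Illustrative, non-exhaustive extensions for the file types described as unsupported.
UNSUPPORTED_EXTENSIONS = {".png", ".jpg", ".gif", ".zip", ".gz", ".tar", ".exe"}


def check_upload_candidate(path, limit_mb=DEFAULT_LIMIT_MB):
    """Return a list of problems that would likely prevent uploading the file."""
    problems = []

    # Check the file type by extension only; this is a rough heuristic.
    extension = os.path.splitext(path)[1].lower()
    if extension in UNSUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension}")

    # Compare the size on disk with the upload limit.
    size_bytes = os.path.getsize(path)
    if size_bytes > limit_mb * 1024 * 1024:
        problems.append(f"file is larger than {limit_mb} MB")

    return problems
```

An empty list means that the file passed both checks. If you have changed the upload limit, pass the new value as `limit_mb`.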

The following information describes the data collection process:

- To collect data from an individual file
- Changing the default limit for uploading data

To collect data from an individual file

Navigate to Administration > Data Collectors > Add Data Collector.
In the Name box, provide a unique name to identify this data collector.
From the Type list, select Upload file.

Provide the following information, as appropriate:

Field	Description
Target/Collection Host
Target Host	(Optional) Select from a list of hosts that you have already configured under Administration > Hosts. The target host is the computer from which you want to retrieve the data. You can choose to select the target host and inherit the host-level tags and group access permissions already added to the host, or manually enter the host name in the Server Name field.
Collection Host (Agent)	Type or select the collection host depending on whether you want to use the Collection Station or the Collection Agent to perform data collection. The collection host is the computer on which the Collection Station or the Collection Agent is located. By default, the Collection Station is already selected. You can either retain the default selection or select the Collection Agent. Note: For this type of data collector, the target host and collection host values are the same.
Collector Inputs
Server Name	Enter the host name of the server from which you want to retrieve the data. Note: If you selected a target host earlier, this field is automatically populated. The value of this field is necessary for generating the "HOST" field that enables effective data search. This field is mandatory to enable you to search data that you are uploading by host name.
File Path	Provide the path of the log file.
Time Zone	(Optional) Accept the default Use file time zone option or select a time zone from the list. With the default option, data is indexed according to the time zone available in the data file. If the data file does not contain a time zone, the time zone of the collection host (Collection Station or Collection Agent server) is used. The selected time zone must match the time zone of the server from which you want to collect data. If you manually specify a time zone even though the file contains one, the manually specified time zone overrides the file time zone. The Time Zone field takes into account changes due to Daylight Saving Time (DST) wherever applicable.
Data Pattern
Pattern	Assign the data pattern (and optionally a date format) for indexing the data file. The data pattern and date format together determine how the data is indexed. When you select a data pattern, the matching date format is selected automatically. However, you can override the date format by manually selecting another date format or by selecting the option to create a new date format. In that case, the date format is used to index the date and time string, while the rest of the data is indexed according to the selected data pattern.
Instead of manually browsing the list of available data patterns, you can click Auto-Detect to find a list of matching data patterns. If no matching data patterns are found, a list of matching date formats is displayed. When you select a date format, the date and time string in the data is indexed with that date format, while the rest of the data is indexed as free text.
If neither a matching data pattern nor a matching date format is found, you can index the data as free text. Depending on whether the data contains a date and time string, assign the data pattern Free Text with Timestamp or Free Text without Timestamp. All records processed with the Free Text without Timestamp option are assumed to be a single line of data with a line terminator at the end of the event. To distinguish records in a custom way, specify a custom string or regular expression in the Event Delimiter box, which determines where each new line starts in the data.
If you are collecting JSON data, then depending on whether the data contains a date and time string, assign the data pattern JSON with Timestamp or JSON without Timestamp.
After assigning the data pattern (and optionally a date format), you can preview the sample records. For more information, see Assigning-the-data-pattern-and-date-format-to-a-data-collector.
Notes: Before filtering the relevant data patterns by clicking Auto-Detect, ensure that the correct file encoding is set. If you select both a pattern and a date format, the product uses the date format to index the timestamp and the pattern to index the rest of the event data.
Date Format
Date Locale	(Optional) You can use this setting to enable reading the date and time string based on the language selected. Note that this setting applies only to those portions of the date and time string that consist of letters (digits are not considered). By default, this value is set to English. You can manually select a language to override the default locale. For a list of supported languages, see Language information for IT Data Analytics.
File Encoding	If your data file uses a character set encoding other than UTF-8 (default), then do one of the following:
- Filter the relevant character set encodings that match the file. To do this, click Filter relevant charset encoding next to this field.
- Manually scan through the list available and select an appropriate option.
- Allow TrueSight IT Data Analytics to use a relevant character set encoding for your file by manually selecting the AUTO option.
Read from Past (#days)	Indicates the number of days for which past data can be collected and indexed. The maximum amount of past data that can be collected into the system is defined by the data retention period. You can edit this value only when you create the data collector. The value cannot exceed the retention period associated with the index block that you set from Advanced Options > Index Block. Recommendation: BMC recommends that you not use a very high value in this field (for example, 365), to avoid collecting a very large amount of data into the system in a short time.
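The Time Zone field in the table above describes an order of precedence: a manually selected time zone overrides the time zone in the file, and the time zone of the collection host is used only when the file contains none. The following minimal sketch restates that order. The function `resolve_time_zone` and the zone names in the usage lines are hypothetical and are not part of the product.

```python
def resolve_time_zone(manual_tz, file_tz, collection_host_tz):
    """Pick the time zone used for indexing, following the order described above.

    manual_tz is None when the default "Use file time zone" option is kept.
    file_tz is None when the data file does not contain a time zone.
    """
    if manual_tz is not None:
        # A manually selected time zone overrides the time zone in the file.
        return manual_tz
    if file_tz is not None:
        # Default option: use the time zone found in the data file.
        return file_tz
    # Fallback: the time zone of the Collection Station or Collection Agent server.
    return collection_host_tz


# Hypothetical values, for illustration only.
print(resolve_time_zone(None, "UTC", "America/Chicago"))
print(resolve_time_zone("Asia/Kolkata", "UTC", "America/Chicago"))
print(resolve_time_zone(None, None, "America/Chicago"))
```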

Advanced Options

Ignore Data Matching Input	(Optional) If you do not want to index certain lines in your data file, you can ignore them by providing one of the following inputs:
- A line that consistently occurs in the event data that you want to ignore. This line is used as the criterion to ignore data during indexing.
- A Java regular expression. Data that matches the regular expression is ignored.
Example: With the following sample data, to ignore the line containing the string "WARN", specify WARN in this field. To ignore lines containing either "WARN" or "INFO", specify the regular expression .*(WARN|INFO).* in this field.
Sample data
Sep 25, 2014 10:26:47 AM net.sf.ehcache.config.ConfigurationFactory parseConfiguration():134 WARN: No configuration found. Configuring ehcache from ehcache-failsafe.xml found in the classpath:
Sep 25, 2014 10:26:53 AM com.bmc.ola.metadataserver.MetadataServerHibernateImpl bootstrap():550 INFO: Executing Query to check init property: select * from CONFIGURATIONS where userName = 'admin' and propertyName ='init'
Sep 30, 2014 07:03:06 PM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions():144 ERROR: An SQLException was provoked by the following failure: java.lang.InterruptedException
Sep 30, 2014 04:39:27 PM com.bmc.ola.engine.query.ElasticSearchClient indexCleanupOperations():206 INFO: IndexOptimizeTask: index: bw-2014-09-23-18-006 optimized of type: data
Index Block	Indicates the index block with which you want to associate the data collector. You can associate a data collector with one of several index blocks, each having a configurable retention period. By default, this value is set to Small. A maximum of five index blocks is allowed. Besides the three predefined index blocks (Small, Medium, and Large), you can create two custom index blocks. When you select an index block, the properties of that index block are displayed below it:
- Archive: Indicates whether the data that you index using the selected index block will be archived.
- Retention Days: Indicates the retention days associated with the index block. Following are the retention days associated with the typical index blocks. The retention days displayed can be as configured by your administrator.
Select the index block according to the retention days and Archive status that you need. If the Archive status is Off and you need to archive your data, contact your administrator to set the Archive status for the index block to On. For more information on how to set the archive status of the index block, see Changing System Settings.
Note: If you select the ITDA Metrics data pattern while creating a data collector, the Index Block field is unavailable because the Metrics Index Block is automatically associated with the data collector.
Best Effort Collection	(Optional) If you clear this check box, only those lines that match the data pattern are indexed; all other data is ignored. To index the non-matching lines in your data file, keep this check box selected. Note: Non-matching lines in the data file are indexed on the basis of the Free Text with Timestamp data pattern.
Example: The following lines provide sample data that you can index by using the Hadoop data pattern. In this scenario, if you select this check box, all lines are indexed. But if you clear the check box, only the first two lines are indexed.
Sample data
2014-08-08 15:15:43,777 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.20.35.35:35983, dest: /10.20.35.30:50010, bytes: 991612, op: HDFS_WRITE, cliID:
2014-08-08 15:15:44,053 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-6260132620401037548_683435 src: /10.20.35.35:35983 dest: /10.20.35.30:50010
2014-08-08 15:15:49,992 IDFSClient_-19587029, offset: 0, srvID: DS-731595843-10.20.35.30-50010-1344428145675, blockid: blk_-8867275036873170670_683436, duration: 5972783
2014-08-08 15:15:50,992 IDFSClient_-19587029, offset: 0, srvID: DS-731595843-10.20.35.30-50010-1344428145675, blockid: blk_-8867275036873170670_683436, duration: 5972783
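To try out a value for the Ignore Data Matching Input field before you create the data collector, you can test it against a few sample events. The following is a minimal sketch under two assumptions: it uses Python's `re` module rather than the Java regular expression engine that the product uses (simple patterns such as the ones below behave the same way in both), and it treats a match anywhere in the event as a reason to ignore it. The function `drop_ignored` is hypothetical, and the sample events are shortened versions of the sample data above.

```python
import re

# Shortened versions of the sample events from the Ignore Data Matching Input example.
SAMPLE_EVENTS = [
    "Sep 25, 2014 10:26:47 AM net.sf.ehcache.config.ConfigurationFactory "
    "parseConfiguration():134 WARN: No configuration found.",
    "Sep 25, 2014 10:26:53 AM com.bmc.ola.metadataserver.MetadataServerHibernateImpl "
    "bootstrap():550 INFO: Executing Query to check init property",
    "Sep 30, 2014 07:03:06 PM org.hibernate.engine.jdbc.spi.SqlExceptionHelper "
    "logExceptions():144 ERROR: An SQLException was provoked",
    "Sep 30, 2014 04:39:27 PM com.bmc.ola.engine.query.ElasticSearchClient "
    "indexCleanupOperations():206 INFO: IndexOptimizeTask",
]


def drop_ignored(events, ignore_input):
    """Return the events that do not match the ignore input.

    Assumption: a plain string such as "WARN" is also treated as a pattern,
    which is safe here because it contains no special regex characters.
    """
    pattern = re.compile(ignore_input)
    return [event for event in events if not pattern.search(event)]


# Ignore events containing "WARN": three events remain.
print(len(drop_ignored(SAMPLE_EVENTS, "WARN")))

# Ignore events containing "WARN" or "INFO": only the ERROR event remains.
print(drop_ignored(SAMPLE_EVENTS, r".*(WARN|INFO).*"))
```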

Tags

Inherit Host Level Tags From Target Host

(Optional) Select this check box to inherit your tag selections associated with the target host that you selected earlier. This option is not applicable if you did not select a target host. Note: After selecting this check box, you can further manually select additional user groups. When you manually select additional user groups, both the inherited permissions as well as the manually assigned permissions are applied. To remove the inherited permissions, clear this check box.

Select Tag name and corresponding value

(Optional) Select a tag name and specify the corresponding value by which you want to categorize the data collected. Later while searching data, you can use these tags to narrow down your search results.

Example: If you are collecting data from hosts located in Houston, you can select a tag name for "Location" and specify "Houston" as the value. While searching the data, you can use the tag Location="Houston" to filter data and see results associated with the Houston location.

To be able to see tag names, you need to first add them by navigating to Administration > System Settings.

To specify tag names and corresponding values, in the left box select a tag name and then type the corresponding tag value in the right box. While you type the value, you might see type-ahead suggestions based on values specified in the past. If you want to use one of the suggestions, click the suggestion. Click Add to add the tag name and corresponding value to the list of added tags that follow. Click Remove Tag to remove a tag.

The tags saved while creating the data collector are displayed on the Search tab, under the Filters panel, and in the Tags section.

Note: You can specify only one value for a tag name at a time. To specify multiple values for the same tag name, repeat the process for each value: select the tag name, specify the corresponding value, and click Add.

For more information about tags, see Understanding-tags.
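Conceptually, each added tag is a name and value pair attached to the collected data, and the same tag name can appear more than once with different values. The following minimal sketch shows how a filter such as Location="Houston" narrows results. It is not how the product stores or searches data; the record layout, the function `filter_by_tag`, and all values other than Location and Houston are made up for illustration.

```python
# Hypothetical records; each carries a list of (tag name, tag value) pairs.
records = [
    {"message": "disk full", "tags": [("Location", "Houston")]},
    {"message": "login failed", "tags": [("Location", "Pune")]},
    # The same tag name added twice, each time with a different value.
    {"message": "service restarted", "tags": [("Location", "Houston"), ("Location", "Austin")]},
]


def filter_by_tag(records, name, value):
    """Return the records that carry the given tag name and value pair."""
    return [record for record in records if (name, value) in record["tags"]]


# Equivalent in spirit to searching with Location="Houston".
for record in filter_by_tag(records, "Location", "Houston"):
    print(record["message"])
```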

Group Access

Inherit Host Level Access Groups From Target Host

(Optional) Select this check box to inherit your group access configurations associated with the target host that you selected earlier. This option is not applicable if you did not select a target host.

Note: After selecting this check box, you can further manually select additional user groups. When you manually select additional user groups, both the inherited permissions as well as the manually assigned permissions are applied. To remove the inherited permissions, clear this check box.

Select All Groups

(Optional) Select this option if you want to select all user groups. You can also manually select multiple user groups.

Notes: You can access data retrieved by this data collector based on the following conditions.

- If user groups are not selected and data access control is enabled: Only the creator of the data collector can access data retrieved by this data collector.
- If user groups are not selected and data access control is not enabled: All users can access data retrieved by this data collector. You can restrict access permissions by selecting the relevant user groups that must be given access permissions. To enable data access control, navigate to Administration > System Settings.

For more information, see Managing-user-groups-in-IT-Data-Analytics.
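The access conditions in the notes above can be summarized as a small decision function. This is a sketch only: the function `can_access` and its parameters are hypothetical, and the branch for selected user groups is an assumption, because the notes state only that selecting user groups restricts access.

```python
def can_access(user, user_groups, creator, selected_groups, access_control_enabled):
    """Decide whether a user can see data retrieved by a data collector."""
    if selected_groups:
        # Assumption: when user groups are selected, only their members get access.
        # The notes above do not spell out this case in detail.
        return bool(set(user_groups) & set(selected_groups))
    if access_control_enabled:
        # No groups selected, data access control enabled: creator only.
        return user == creator
    # No groups selected, data access control not enabled: all users.
    return True


# Hypothetical users and groups, for illustration only.
print(can_access("maria", ["ops"], "admin", [], True))
print(can_access("admin", ["ops"], "admin", [], True))
print(can_access("maria", ["ops"], "admin", [], False))
```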

Click Create to save your changes.

Changing the default limit for uploading data

By default, the limit for uploading a file is set to 100 MB. You can change this limit by adding the collection.upload.maximumAllowedFileSizeInMB property in the olaengineCustomConfig.properties file located at %BMC_ITDA_HOME%\custom\conf\server. The value of this property is the size of data (in MB) that you want to collect.

After adding the property, save the file, and then restart the service for the Console Server. For more information, see Starting-or-stopping-product-services.
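If you prefer to script the change, the following minimal sketch adds the property to the text of a properties file, or replaces its value if the property is already present. It works on text only and leaves reading and writing olaengineCustomConfig.properties to you. The function `set_upload_limit`, the value 200, and the `some.other.setting` line are illustrative assumptions.

```python
PROPERTY = "collection.upload.maximumAllowedFileSizeInMB"


def set_upload_limit(properties_text, limit_mb):
    """Return the properties text with the upload limit added or replaced."""
    new_line = f"{PROPERTY}={limit_mb}"
    lines = properties_text.splitlines()
    for index, line in enumerate(lines):
        # Compare only the key part, ignoring spaces around the equals sign.
        if line.split("=", 1)[0].strip() == PROPERTY:
            lines[index] = new_line
            break
    else:
        # The property is not present yet, so append it.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical existing content of the properties file.
existing = "some.other.setting=true\n"
print(set_upload_limit(existing, 200), end="")
```

The new limit applies only after the file is saved and the Console Server service is restarted.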

Recommendation

BMC recommends that you upload files smaller than 100 MB. A file larger than 100 MB might cause the data collector to time out.
