
You can create a data collector that connects to a Microsoft Windows or Linux computer over SSH and retrieves event data for monitoring.


To collect data by using an SSH connection

  1. Navigate to Administration > Data Collectors > Add Data Collector.

  2. In the Name box, provide a unique name to identify this data collector.

  3. From the Type list, select Monitor File over SSH.

  4. Provide the following information, as appropriate:

     

    Field | Description
    Target/Collection Host
    Target Host

    (Optional) Select from a list of hosts that you have already configured under Administration > Hosts.

    The target host is the computer from which you want to retrieve the data. You can choose to select the target host and inherit the host-level tags and group access permissions already added to the host, or manually enter the host name in the Server Name field.

    Collection Host (Agent)

    Type or select the collection host depending on whether you want to use the Collection Station or the Collection Agent to perform data collection.

    The collection host is the computer on which the Collection Station or the Collection Agent is located.

    By default, the Collection Station is already selected. You can either retain the default selection or select the Collection Agent.

    Note: For this type of data collector, the target host and collection host are expected to have different values.

    Collector Inputs
    Server Name

    Enter the host name of the server from which you want to retrieve the data.

    Note: If you selected a target host earlier, this field is automatically populated. The value of this field is necessary for generating the "HOST" field that enables effective data search.

    Credentials

    (Optional) Select one of the following options:

    • Apply security credential to automatically populate the user name and password fields.
      Then select the appropriate credential (profile) from the Available Credential list that you already configured under Administration > Credentials.
    • Provide Credential to manually add user name and password credentials.
      Then enter the credentials in the User Name and Password fields.
      You can also create a credential that uses the manually entered details by clicking Add Credential next to the Password field.
    User Name

    Provide the user name for connecting with the server from which you want to retrieve the data.

    Note: This field is disabled if you applied a security credential earlier.

    The product supports only password-based authentication for connecting with the SSH server.

    Password

    Provide the password for connecting with the server from which you want to retrieve the data.

    Click Add Credential, provide a credential name, and click OK to create a new credential (profile) from the values in the User Name and Password fields. After the credential is created, it is displayed under Administration > Credentials.

    Note: This field is disabled if you applied a security credential earlier.

    Directory Path

    Provide the absolute path of the data file.

    To retrieve data files from subdirectories, provide the path up to the parent directory.

    Include sub-directories

    (Optional) Select this check box if you want to retrieve log files from subdirectories of the specified file path.
    Filename/Rollover Pattern

    Specify the file name only, or specify the file name with a rollover pattern to identify subsequent logs.

    You can use the following wild card characters:

    • Asterisk (*)—Can be used to substitute zero or more characters in the file name.
    • Question mark (?)—Can be used to substitute exactly one character in the file name.

    Specifying a rollover pattern is useful for monitoring rolling log files that are saved with the same name but differentiated by a variable such as a time stamp or a number.

    Note: Ensure that you specify a rollover pattern for identifying log files that follow the same data format (which means they will be indexed with the same data pattern).

    Scenario 1

    Suppose you want to collect log files saved with succeeding numbers once they reach a certain size; for example:

    IAS0.log

    IAS1.log

    IAS2.log

    Rollover pattern: In this scenario, you can specify the rollover pattern as IAS?.log.

    Scenario 2

    Suppose you want to collect log files that roll over every hour and are saved with the same date but a different time stamp in the YYYY-MM-DD-HH format; for example:

    2013-10-01-11.log

    2013-10-01-12.log

    2013-10-01-13.log

    Rollover pattern: In this scenario, you can specify the rollover pattern as 2013-10-01-*.log.
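The wildcard rules above (asterisk for zero or more characters, question mark for exactly one) can be sketched by translating a rollover pattern into a Java regular expression. This is only an illustration of the matching rule, not the product's implementation; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative sketch: interpret the rollover wildcards as a regex.
// "*" matches zero or more characters; "?" matches exactly one.
public class RolloverPatternDemo {

    // Convert a rollover pattern with * and ? wildcards into a regex.
    static Pattern toRegex(String rolloverPattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : rolloverPattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");      // zero or more characters
            } else if (c == '?') {
                regex.append(".");       // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // literal
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String pattern, String fileName) {
        return toRegex(pattern).matcher(fileName).matches();
    }

    public static void main(String[] args) {
        // Scenario 1: IAS?.log matches IAS0.log, IAS1.log, IAS2.log, ...
        System.out.println(matches("IAS?.log", "IAS0.log"));   // true
        System.out.println(matches("IAS?.log", "IAS10.log"));  // false: ? is exactly one character
        // Scenario 2: 2013-10-01-*.log matches any hourly rollover of that date
        System.out.println(matches("2013-10-01-*.log", "2013-10-01-13.log")); // true
    }
}
```

Note that under this rule IAS?.log does not match IAS10.log, so a numeric rollover that can exceed single digits would need IAS*.log instead.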

    Time Zone

    By default, the Use file time zone option is selected, which means the data is indexed according to the time zone available in the data file. If the data file does not contain a time zone, the time zone of the collection host (Collection Station or Collection Agent) server is used by default.

    You can also manually select a time zone from the available list. This time zone must match the time zone of the server from which you want to collect data. If your data file contains a time zone and you also manually specify one, the manually specified time zone overrides the file time zone.

    Data Pattern
    Pattern

    Select the data pattern to use for indexing the data file.

    Use one of the following methods to specify the data pattern:

    • Filter the relevant data patterns that match the file: Click Auto-Detect to automatically find a list of matching data patterns. Click each of the data patterns displayed on the left to see a preview of the sample records. The preview helps you understand how the data will be indexed and made available for searching.
      Note: Before filtering the relevant data patterns by clicking Auto-Detect, under the Advanced Options section, ensure that the correct file encoding is set.
    • Manually select a data pattern: Scan through the available list, select a data pattern, and click Preview to see how the sample records are parsed.

      Alternatively, select one of the following options and click Preview to see the sample records parsed:

      • Free Text with Timestamp: This option can be useful if you cannot find matching data patterns.
        This option uses the date format to capture the timestamp and the rest of the data appears in raw format.
      • Free Text without Timestamp: This option parses all the records as free text.
        This option can be useful in the following scenarios:
        • If you cannot find matching data patterns and matching date formats.
        • If you cannot find matching data patterns and your data does not contain a timestamp.

        Note: All the records processed using this option are assumed to be a single line of data with a line terminator at the end of the event. Records are distinguished on the basis of the new line separator.
        If you want to distinguish records in a custom way, then you can specify a custom string or regular expression in the Event Delimiter box that decides where the new line starts in the data. This string or regular expression must correspond to some text in your data which appears at the beginning of a line.

        The following regular expression distinguishes records when the line starts with "INFO" or "ERROR" or "WARN".

        ^(INFO|WARN|ERROR)

        The following regular expression distinguishes records when the line starts with “com.bmc.ola”.

        ^(com\.bmc\.ola)

    • Create a new data pattern: If you are not satisfied with the results from the available data patterns, select Add Data Pattern at the end of the list. This option redirects you to the Administration > Data Patterns page, where you can create a new data pattern or customize an existing data pattern by cloning it.

    Note: If you select both a pattern and a date format, the product uses the date format to index the timestamp and the pattern to index the rest of the event data.
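The Event Delimiter behavior described above can be sketched as follows: any line matching the delimiter regular expression starts a new record, and subsequent lines are folded into that record. This is an illustrative interpretation, not the product's implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of event splitting driven by an Event Delimiter regex:
// a line matching the regex begins a new event.
public class EventDelimiterDemo {

    static List<String> splitEvents(String data, String delimiterRegex) {
        Pattern start = Pattern.compile(delimiterRegex);
        List<String> events = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : data.split("\n")) {
            // A matching line closes the previous event and opens a new one.
            if (start.matcher(line).find() && current.length() > 0) {
                events.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) current.append('\n');
            current.append(line);
        }
        if (current.length() > 0) events.add(current.toString());
        return events;
    }

    public static void main(String[] args) {
        String data =
            "INFO starting service\n" +
            "  loaded 3 plugins\n" +
            "WARN low disk space\n" +
            "ERROR write failed\n" +
            "  retrying in 5s";
        // Using the delimiter from the text: a new event begins whenever a
        // line starts with INFO, WARN, or ERROR.
        List<String> events = splitEvents(data, "^(INFO|WARN|ERROR)");
        System.out.println(events.size()); // 3
    }
}
```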

    Date Format

    When you select a data pattern, the matching date format is automatically updated. However, you can also search specifically for date formats that match the timestamp in your data file.

    Use one of the following methods to specify a date format:

    • Filter the relevant date formats: Click Auto-Detect to automatically find a list of matching data patterns and date formats. If no matching data patterns are found, a list of matching date formats is displayed. Click each of the date formats displayed on the left to see a preview of the sample records.
      Note: Before filtering the relevant date formats by clicking Auto-Detect, under the Advanced Options section, ensure that the correct file encoding is set.
    • Manually select a date format: Scan through the available list and select a date format. Click Preview to see how the sample records are parsed.
      Alternatively, from the Pattern list, select Free Text with Timestamp and click Preview to find the relevant data formats that match the file.
    • Create a new date format: If you are not satisfied with the results from the available date formats, you can create a new date format. To do this, select the Create new Date Format option and manually enter the date format for the timestamp that you want to capture. For example, if your data file contains the timestamp "28 Apr 2014 10:58:28", your date format must be dd MMM yyyy HH:mm:ss.

    Notes:

    • If you select both a pattern and a date format, the specified date format takes precedence over the date format from the selected pattern. The timestamp is indexed according to the specified date format, and the rest of the data is indexed according to the pattern.
    • If you select only a date format, then the date format is used for indexing the timestamp, while the rest of the data is displayed in a raw format in your search results.
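The format letters in the example above (dd, MMM, yyyy, HH:mm:ss) follow the java.text.SimpleDateFormat conventions, so a format string can be checked against a sample timestamp as below. The assumption that the product uses these conventions is based on the example; the class and method names here are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Sketch: verify that a date format string matches a sample timestamp.
public class DateFormatDemo {

    // Parse a timestamp with the given format and format it back out;
    // returns null if the format does not match the timestamp.
    static String roundTrip(String pattern, String timestamp) {
        try {
            SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
            return fmt.format(fmt.parse(timestamp));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The example from the text: "28 Apr 2014 10:58:28" with dd MMM yyyy HH:mm:ss
        System.out.println(roundTrip("dd MMM yyyy HH:mm:ss", "28 Apr 2014 10:58:28"));
        // A non-matching format fails to parse and returns null.
        System.out.println(roundTrip("yyyy-MM-dd", "28 Apr 2014 10:58:28"));
    }
}
```

The Locale argument corresponds to the Date Locale setting: letter-based parts of the timestamp, such as the month name "Apr", are interpreted according to the selected language.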
    Date Locale

    You can use this setting to enable reading the date and time string based on the selected language. This setting applies only to those portions of the date and time string that consist of letters (digits are not affected).

    By default, this value is set to English.

    You can manually select a language to override the default locale. For a list of languages supported, see Language information.

    Poll Interval (mins)

    Enter a number to specify the poll interval (in minutes) for the log collection.

    By default, this value is set to 1.

    Start/Stop Collection

    (Optional) Select this check box if you want to start the data collection immediately.

    File Encoding

    If your data file uses a character set encoding other than UTF-8 (default), then do one of the following:

    • Filter the relevant character set encodings that match the file.
      To do this, click Filter relevant charset encoding next to this field.
    • Manually scan through the list available and select an appropriate option.
    • Allow IT Data Analytics to choose a relevant character set encoding for your file by manually selecting the AUTO option.

    Ignore Data Matching Input

    (Optional) If you do not want to index certain lines in your data file, then you can ignore them by providing one of the following inputs:

    • Provide a line that consistently occurs in the event data that you want to ignore. This line will be used as the criterion to ignore data during indexing.
    • Provide a Java regular expression that will be used as the criterion for ignoring data matching the regular expression.

    Example: With the following sample data, you can provide one of the following inputs to ignore particular lines.

    • To ignore the line containing the string, "WARN", you can specify WARN in this field.
    • To ignore lines containing either "WARN" or "INFO", you can specify the regular expression .*(WARN|INFO).* in this field.
    Sample data
    Sep 25, 2014 10:26:47 AM net.sf.ehcache.config.
    ConfigurationFactory parseConfiguration():134
    WARN: No configuration found. Configuring ehcache from 
    ehcache-failsafe.xml  found in the classpath:
    
    Sep 25, 2014 10:26:53 AM com.bmc.ola.metadataserver.
    MetadataServerHibernateImpl bootstrap():550
    INFO: Executing Query to check init property: select * 
    from CONFIGURATIONS where userName = 'admin' and 
    propertyName ='init'
    
    Sep 30, 2014 07:03:06 PM org.hibernate.engine.jdbc.spi.
    SqlExceptionHelper logExceptions():144
    ERROR: An SQLException was provoked by the following 
    failure: java.lang.InterruptedException
    
    Sep 30, 2014 04:39:27 PM com.bmc.ola.engine.query.
    ElasticSearchClient indexCleanupOperations():206
    INFO: IndexOptimizeTask: index: bw-2014-09-23-18-006 
    optimized of type: data
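Assuming the regular expression is applied to each line as a full-line Java regex match (an assumption; the exact matching mode is not documented here), the ignore criterion behaves like this sketch, with hypothetical class and method names:

```java
import java.util.regex.Pattern;

// Sketch: a line matching the ignore regex is dropped before indexing.
public class IgnoreFilterDemo {

    static boolean ignored(String line, String ignoreRegex) {
        // Pattern.matches requires the regex to match the entire line,
        // hence the leading and trailing .* in the example expression.
        return Pattern.matches(ignoreRegex, line);
    }

    public static void main(String[] args) {
        String regex = ".*(WARN|INFO).*"; // from the example above
        System.out.println(ignored("WARN: No configuration found.", regex));      // true
        System.out.println(ignored("INFO: Executing Query to check init", regex)); // true
        System.out.println(ignored("ERROR: An SQLException was provoked", regex)); // false
    }
}
```

In the sample data above, this expression would drop the WARN and INFO records while the ERROR record is still indexed.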
    Best Effort Collection

    (Optional) If you clear this check box, only those lines that match the data pattern are indexed; all other data is ignored. To index the non-matching lines in your data file, keep this check box selected.

    Note: Non-matching lines in the data file are indexed on the basis of the Free Text with Timestamp data pattern.

    Example: The following lines provide sample data that you can index by using the Hadoop data pattern. In this scenario, if you select this check box, all lines are indexed. But if you clear the check box, only the first two lines are indexed.

    Sample data
    2014-08-08 15:15:43,777 INFO org.apache.hadoop.hdfs.server.
    datanode.DataNode.clienttrace: src: /10.20.35.35:35983, dest: 
    /10.20.35.30:50010, bytes: 991612, op: HDFS_WRITE, cliID:
    
    2014-08-08 15:15:44,053 INFO org.apache.hadoop.hdfs.server.
    datanode.DataNode: Receiving block blk_-6260132620401037548_
    683435 src: /10.20.35.35:35983 dest: /10.20.35.30:50010
    
    2014-08-08 15:15:49,992 IDFSClient_-19587029, offset: 0, 
    srvID: DS-731595843-10.20.35.30-50010-1344428145675, 
    blockid: blk_-8867275036873170670_683436, duration: 5972783
    
    2014-08-08 15:15:50,992 IDFSClient_-19587029, offset: 0, 
    srvID: DS-731595843-10.20.35.30-50010-1344428145675, 
    blockid: blk_-8867275036873170670_683436, duration: 5972783

    Host Key Fingerprint

    (Optional) Provide the fingerprint of the RSA host key to connect with the server from which you want to retrieve the data.

    This is the host key that is configured to be used by the SSH server with which you want to connect.

    Example: bc:e1:44:56:bd:b1:4d:b9:6f:4c:a4:ca:07:69:5c:66

    Tip: To get the RSA host key fingerprint, you might want to contact your SSH server administrator.

    For more information, see About the SSH host key fingerprint (BMC contributor page).

    Log File Contains Header

    (Optional) This value is required only if the file that you are collecting contains a constant header that must not be indexed.

    The value must be the actual header appearing in the data.

    Log File Contains Footer

    (Optional) This value is required only if the file that you are collecting contains a constant footer that must not be indexed.

    The value must be the actual footer appearing in the data.

    Inherit Host Level Tags From Target Host

    (Optional) Select this check box to inherit the tag selections associated with the target host that you selected earlier. This option is not applicable if you did not select a target host.

    Note: After selecting this check box, you can further manually add tags. When you do, both the inherited tags and the manually added tags are applied. To remove the inherited tags, clear this check box.
    Select Tag name and corresponding value

    (Optional) Select a tag name and specify the corresponding value by which you want to categorize the data collected. Later while searching data, you can use these tags to narrow down your search results.

    Example: If you are collecting data from hosts located in Houston, you can select a tag name such as "Location" and specify the value "Houston". While searching the data, you can use the tag Location="Houston" to filter data and see results associated with the Houston location.

    To be able to see tag names, you need to first add them by navigating to Administration > System Settings.

    To specify tag names and corresponding values, in the left box select a tag name and then type the corresponding tag value in the right box. While you type the value, you might see type-ahead suggestions based on values specified in the past. If you want to use one of the suggestions, click the suggestion. Click Add to add the tag name and corresponding value to the list of added tags that follow. Click Remove Tag to remove a tag.

    The tags saved while creating the data collector are displayed on the Search tab, under the Filters panel, and in the Tags section.

    Note: You can specify only one value at a time for a tag name. To specify multiple values for the same tag name, select the tag name, specify the corresponding value, and click Add for each value.

    For more information about tags, see Understanding fields.

    Inherit Host Level Access Groups From Target Host

    (Optional) Select this check box to inherit the group access configurations associated with the target host that you selected earlier. This option is not applicable if you did not select a target host.

    Note: After selecting this check box, you can further manually select additional user groups. When you manually select additional user groups, both the inherited permissions as well as the manually assigned permissions are applied. To remove the inherited permissions, clear this check box.
    Select All Groups

    (Optional) Select this option if you want to select all user groups. You can also manually select multiple user groups.

    Notes: You can access data retrieved by this data collector based on the following conditions.

    • If user groups are not selected and data access control is enabled: Only the creator of the data collector can access data retrieved by this data collector.
    • If user groups are not selected and if data access control is not enabled: All users can access data retrieved by this data collector. You can restrict access permissions by selecting the relevant user groups that must be given access permissions. To enable data access control, navigate to Administration > System Settings.

    For more information, see Managing user groups in IT Data Analytics.

  5. Click Create to save your changes.

What to do if an error occurs

To understand the troubleshooting scenarios related to this data collector, see Troubleshooting common issues with the Category filter set to Data collection.