Troubleshooting connector installation


Log events are not generated when the threshold is breached due to a time zone mismatch

Issue symptom

This issue might occur because of a mismatch between the time zone of the application logs and the time zone of the connector host.

Issue scope

The log events are not generated in BMC Helix Operations Management even though the log count exceeds the defined threshold.

Example

The application logs are generated in the GMT +2:00 time zone and the host where the connector is installed is in the GMT +3:30 time zone.

The difference of 1 hour 30 minutes causes the connector to parse the logs with an incorrect time stamp, so their epoch time makes them look like older data. The logs do not meet the alert conditions because they appear to fall outside the current time range.
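
To make the offset arithmetic concrete, the following sketch converts one wall-clock time stamp to epoch seconds twice: once with the application offset and once with the connector host offset. It assumes GNU date (as found on most Linux hosts), and the time stamp is a made-up sample value, not taken from a real log.

```shell
#!/bin/sh
# Hypothetical sample time stamp as the application writes it (no offset in the log line).
stamp='2024-01-15 10:00:00'

# Epoch seconds when the text is read in the application time zone (GMT +2:00).
actual=$(date -d "$stamp +0200" +%s)

# Epoch seconds when the same text is read in the connector host time zone (GMT +3:30).
parsed=$(date -d "$stamp +0330" +%s)

# A positive lag means the parsed event looks older than it really is.
lag=$((actual - parsed))
echo "Parsed event lags the real event by $((lag / 60)) minutes"
```

With these offsets the script reports a lag of 90 minutes, which matches the 1 hour 30 minutes difference in the example.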

Resolution

You can manually add the time zone parameter to the connector file to resolve log event failures due to a time zone mismatch.

Perform the following steps to add the time zone parameter in the Fluentd file_log_pipeline.conf file.

  1. Navigate to the configuration file.

    • Windows path
      <Install_Dir>\opt\td-agent\etc\data\<Policy_ID_Random_Number>\pipeline\file_log_pipeline.conf
    • Linux path
      /opt/td-agent/etc/data/<Policy_ID_Random_Number>/pipeline/file_log_pipeline.conf

    Validate the log file path under the tail plugin path section to ensure the file corresponds to the correct policy.

  2. Modify the configuration file.

    1. Open the file_log_pipeline.conf file.
    2. Locate the first occurrence of the <parse> section, which is of type multiline or regexp.
    3. Add the timezone parameter on the line immediately after the time_format entry.
    4. Set the time zone value to match the connector host time zone. In this scenario, GMT +3:30.

    The following are examples of modified multiline and regexp configuration files:

    • Multiline

      <parse>
        @type multiline
        format_firstline /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/
        format1 /(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(?<msg1>.*)/
        time_format %F %T
        timezone +0000
      </parse>

    • Regex

      <parse>
        @type regexp
        expression /(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(?<message123>.*)/
        time_format %F %T
        timezone +0000
      </parse>

  3. Restart the connector service.

    Stop and start the Fluentd connector service to apply the changes.
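
As a quick check before restarting, the following sketch confirms that a timezone line directly follows a time_format line in the edited file. It assumes a Linux host. CONF is a placeholder that you set to the policy-specific path from step 1, has_timezone is an illustrative helper name, and the suggested systemctl command assumes the service is named td-agent, as the td-agent.service unit mentioned elsewhere on this page suggests.

```shell
#!/bin/sh
# has_timezone FILE: succeed if some time_format line is followed
# immediately by a timezone line (leading spaces are ignored).
has_timezone() {
    awk '
        prev_is_time_format && $1 == "timezone" { found = 1 }
        { prev_is_time_format = ($1 == "time_format") }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Placeholder: set CONF to your file_log_pipeline.conf path from step 1.
CONF=${CONF:-}

if [ -n "$CONF" ]; then
    if has_timezone "$CONF"; then
        echo "timezone parameter found; restart the service, for example:"
        echo "  sudo systemctl restart td-agent"
    else
        echo "no timezone line follows time_format in $CONF" >&2
    fi
fi
```

The script only reports what it finds and prints a suggested command. It does not edit the file or restart the service itself.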

Important

If you update the collection policy after changing the time zone setting in the Fluentd file_log_pipeline.conf file, the update can overwrite your changes. To retain them, reapply the time zone parameter after you update the collection policy.
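
One way to notice an overwrite is to keep a copy of the edited file and compare it after each policy update. The sketch below assumes a Linux host. The function names and the backup path are illustrative and are not connector settings.

```shell
#!/bin/sh
# save_copy FILE BACKUP: keep a copy of the edited configuration file.
save_copy() {
    cp -p "$1" "$2"
}

# changed_since_copy FILE BACKUP: succeed if FILE no longer matches the
# backup (or the backup is missing), which suggests the file was regenerated.
changed_since_copy() {
    ! cmp -s "$1" "$2"
}

# Usage (both paths are placeholders):
#   save_copy "$CONF" "$HOME/file_log_pipeline.conf.tz"    # after adding timezone
#   changed_since_copy "$CONF" "$HOME/file_log_pipeline.conf.tz" \
#       && echo "file changed; reapply the timezone parameter"
```

Storing the copy outside the connector directory avoids leaving extra files next to the generated configuration.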

Unable to start Linux (CentOS 7.9) connector

You get the following error when you start the connector:

Job for td-agent.service failed because the control process exited with error code.
See "systemctl status td-agent.service" and "journalctl -xe" for details. 

Issue symptom

This issue might occur because the libyaml library is missing on the host computer.

Issue scope

This issue might impact all Linux (CentOS 7.9) connectors.

Resolution

Run the following commands to verify whether the library is present on the host computer:

cd /opt/td-agent/bin/

/opt/td-agent/bin/ruby fluent-gem list

If the library is not present, you get the following output:

/opt/td-agent/lib/ruby/2.7.0/yaml.rb:3: warning: It seems your ruby installation is missing psych (for YAML output).
To eliminate this warning, please install libyaml and reinstall your ruby.

Install libyaml-0.1.4 on the host computer and then start the connector.

If the libyaml library is present on the host computer, the command lists the installed gems, as in the following example. If the connector still does not start, contact BMC Support.

[root@<hostname> bin]# /opt/td-agent/bin/ruby fluent-gem list
*** LOCAL GEMS ***

addressable (2.8.0)
async (1.30.2)
async-http (0.56.5)
async-io (1.33.0)
async-pool (0.3.10)
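
The check above can be scripted. This sketch looks for the psych warning in the output of the gem list command. The function name is illustrative, and the yum command in the message assumes that the libyaml package is available from the repositories configured on your host. On an RPM-based host such as CentOS 7.9, rpm -q libyaml is another way to see whether the package is installed.

```shell
#!/bin/sh
# missing_libyaml: read command output on stdin and succeed if it contains
# the warning that Ruby prints when psych (YAML support) is unavailable.
missing_libyaml() {
    grep -q 'missing psych'
}

RUBY=/opt/td-agent/bin/ruby
if [ -x "$RUBY" ]; then
    # Run the same command as in the manual check and capture warnings too.
    if (cd /opt/td-agent/bin && "$RUBY" fluent-gem list 2>&1) | missing_libyaml; then
        echo "libyaml appears to be missing; install it, for example: sudo yum install libyaml"
    else
        echo "no psych warning found; libyaml does not appear to be the cause"
    fi
fi
```

On a host without the connector installed, the script prints nothing because the Ruby binary is not found.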

Log events do not get correlated to the correct host

The results of root cause analysis are incorrect because of this issue.

Issue symptom

This issue occurs because the host.name field in the Log explorer does not display the name of the originating server.

Issue scope

This issue affects all connector types and all collection policies.

Resolution

The log_source_host field now replaces the host.name field for all connector types. For this change to take effect, update each collection policy by performing the following steps:

  1. Go to Collection > Collection Policies.
  2. For a policy, click the Action menu, and then click Edit.
  3. On the edit policy page, click Save.
    You do not need to perform any other action.
  4. Repeat these steps for all collection policies.

 
