Troubleshooting Compliance job performance issues


A TrueSight Server Automation Compliance Job is running slower than expected, or timing out, against one or more Target Servers.

This topic helps the reader to locate and review the appropriate log files to determine why the Compliance job is running slowly and either identify and resolve the issue or create a BMC Customer Support case.

Issue symptoms

  • A TrueSight Server Automation Compliance Job is running slower than expected against one or more Target Servers with errors.
  • The Compliance Job may also be encountering a JOB_TIMEOUT or a JOB_PART_TIMEOUT due to the longer than expected run time.

Issue scope

The issue may be related to specific Target Servers, specific rules or may be an overall Compliance Job performance issue.

Diagnosing and reporting an issue

Task

Action

Steps

Reference

1

Understand the problem scope.

  • Is this Compliance Job using a custom (customer-created) Component Template or BMC-provided Component Template?
  • If the Component Template is BMC-provided content, what is the name and version of the Component Template? (Refer to the screenshot under "Template name and version details" for how to determine the Template name and version.) For example:
    CIS Windows 2019 (Version 1.1.0)
    CIS - Red Hat Enterprise Linux 8 (Version 1.0.0)
    PCI Data Security Standard v3 - Windows Server 2016 (Version 3.2.1)
    PCI Data Security Standard v3 - Red Hat Enterprise Linux 7 (Version 3.2.1)
  • Is the Compliance Job performing Auto-Discovery or was a previous Component Discovery Job run?
  • What Compliance Job rule(s) are running slower than expected?
  • What is the Operating System vendor and version of the Target Servers? For example, MS Windows 2016, RHEL 7 and so on.
  • How many total Target Servers is the Compliance Job running against?
  • What is the job parallelism set to?
  • Are the performance issues happening against all the Target Servers or a subset of the Target Servers?
    • If a subset, how many affected Target Servers?
    • What are the names of some of the affected Target Servers?
  • What is the version of the TSSA Application Server?
  • What are the versions of the affected RSCD Agents?
  • Is the database Oracle or SQL Server?
  • Is the performance problem intermittent, occurring only on some runs, or consistent?
  • Does the Compliance Job have a JOB_TIMEOUT and/or JOB_PART_TIMEOUT property value defined? If so, what are the values for each?
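
The answers about target count, parallelism, and timeouts can be combined into a rough sanity check. The following is a minimal sketch, not a BMC-provided formula: it assumes targets are processed in batches of the job parallelism, that each target takes about the same time, and that the timeout value is expressed in minutes. All figures are hypothetical.

```python
import math

def estimated_runtime_minutes(target_count, parallelism, minutes_per_target):
    """Rough wall-clock estimate, assuming targets run in batches of
    `parallelism` and each target takes about the same time."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    batches = math.ceil(target_count / parallelism)
    return batches * minutes_per_target

# Hypothetical figures: 500 targets, parallelism 50, about 6 minutes each.
estimate = estimated_runtime_minutes(500, 50, 6)
job_timeout_minutes = 45  # hypothetical JOB_TIMEOUT value, assumed to be minutes
print(f"estimated run time: {estimate} min, JOB_TIMEOUT: {job_timeout_minutes} min")
if estimate > job_timeout_minutes:
    print("JOB_TIMEOUT is likely to be reached before the job finishes")
```

If the estimate is well above the configured timeout even for healthy targets, the timeout or parallelism setting may be the issue rather than a slow rule.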

Template name and version details:

Teamplate_version_identifier.png



2

Review Component Template for unnecessary template parts.

During a Compliance Job run, events occur in the following order:

1) All Template Parts are first collected from the Target.

2) Compliance Rules are evaluated.

A common cause of Compliance performance problems is a Component Template that defines Template Parts which are not used by any Compliance Rule.

Such Template Parts are still collected from every Target Server, even though they are not required for Compliance Rule evaluation.

Review the Template Parts of the Component Template and remove any parts not used by Compliance Rules.
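
The review amounts to a set difference between the parts the template defines and the parts the rules use. The following is a minimal sketch of that comparison, assuming the part names and rule references have been transcribed by hand from the Component Template. The names below are hypothetical, and the function is not part of any TrueSight Server Automation API.

```python
def unused_template_parts(template_parts, rule_references):
    """Return template parts that no compliance rule refers to.

    template_parts: iterable of part names defined in the Component Template.
    rule_references: mapping of rule name -> iterable of part names it uses.
    """
    used = set()
    for parts in rule_references.values():
        used.update(parts)
    return sorted(set(template_parts) - used)

# Hypothetical inventory, transcribed by hand from the Component Template.
parts = ["/etc/shadow", "/etc/passwd", "/var/log", "Extended Object: sshd_config"]
rules = {
    "Shadow file permissions": ["/etc/shadow"],
    "SSH root login disabled": ["Extended Object: sshd_config"],
}
for part in unused_template_parts(parts, rules):
    print("not referenced by any rule:", part)
```

Parts reported by such a comparison are candidates for removal. Confirm in the console that no rule depends on them before deleting anything.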


3

Test the individual Compliance Rule from the TrueSight Server Automation Console

Can the performance issue be reproduced using the "Test Rule" functionality of the TrueSight Server Automation Console?

Testing the rule from the TrueSight Server Automation Console allows the user to reproduce and troubleshoot the behavior outside the context of a Compliance Job.

The following steps describe how to test a Compliance Rule from the TrueSight Server Automation Console.


  1. Open the Component Template.
  2. Open the Rule that appears to be running slowly.
  3. Click Play to test the rule.

    10.png
  4. Add the Component or Target Server (you need to do this only once).

    11.png

  5. Click Run Test.

    12.png

  6. Review the results and the performance of the Rule Test.

For more information, see Testing a compliance rule.

4

Test the command(s) run by the rule directly on the Target Server

If the Compliance Job performance issues are confined to specific rule(s) on specific Target(s), can the commands used by the rules be run directly on the Targets?

If so, does this manual run encounter the same performance issues?

This step may not apply if the performance issues are not related to specific rules/Targets but are a general issue with the Compliance Job.

For example, if the Compliance Rule is checking the permissions of a file, this can be validated directly on the Target Server and also via a Live Browse from the TrueSight Server Automation Console:

  1. Log in to the impacted Target Server, or Live Browse the Target Server from the TrueSight Server Automation Console.
  2. Go to the file location in question. For example, cd /etc
  3. Check the permissions on the file
    In this example, it is 000.

See "Checking file permission directly from Target Server" and the Live Browse steps for an example of checking file permissions directly from a Target Server and from the TrueSight Server Automation Console.
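
When checking manually, it helps to record how long the check takes and not only its result. The following is a minimal sketch, assuming Python 3 is available on a Linux Target Server. The function name is illustrative, and /etc/shadow is simply the file used in the example above.

```python
import os
import stat
import time

def timed_permission_check(path):
    """Return (octal mode string, owner uid, elapsed seconds) for path."""
    start = time.perf_counter()
    info = os.stat(path)
    elapsed = time.perf_counter() - start
    mode = format(stat.S_IMODE(info.st_mode), "03o")
    return mode, info.st_uid, elapsed

# Example path from the steps above; replace it with the file that the
# slow rule actually checks.
EXAMPLE_PATH = "/etc/shadow"

if os.path.exists(EXAMPLE_PATH):
    mode, uid, seconds = timed_permission_check(EXAMPLE_PATH)
    print(f"{EXAMPLE_PATH}: mode={mode} uid={uid} stat took {seconds:.6f}s")
```

If the local check returns almost instantly while the rule is slow, the delay is more likely in collection or evaluation than in the file system of the Target Server.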

Different Compliance Rules will check for other conditions which can similarly be tested directly on the Target. For example,

  • Results of a script executed via an Extended Object
  • Does a file exist
  • Does a registry entry exist
  • Checksum of a specific file
  • Value of a registry entry
  • Ownership of a file
  • Presence of a specific entry in a Configuration file
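
Some of the conditions listed above can be reproduced and timed with a short script. The following is a minimal sketch, assuming Python 3 on the Target Server. The helper names are illustrative, and SHA-256 is an assumption: use whichever checksum algorithm the rule is actually configured with.

```python
import hashlib
import time

def timed(check, *args):
    """Run check(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = check(*args)
    return result, time.perf_counter() - start

def sha256_of(path):
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def has_config_entry(path, entry):
    """True if any non-comment line in the file contains entry."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return any(entry in line for line in handle
                   if not line.lstrip().startswith("#"))

# Usage, with a hypothetical file and entry:
#   digest, seconds = timed(sha256_of, "/etc/ssh/sshd_config")
#   found, seconds = timed(has_config_entry, "/etc/ssh/sshd_config", "PermitRootLogin no")
```

A checksum over a very large file is a plausible source of a slow rule, so comparing the elapsed time across affected and unaffected Target Servers can narrow the scope.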

Checking file permission directly from Target Server:

5.png

To check the file permission by Live Browsing the Target Server:

  1. Right-click the Target Server and select Browse.

    7.png
  2. Go to 'File System' > /etc/shadow and validate the permissions.

    8.png

    9.png

5

Enable Debug Logging

The log4j.properties or log4j2.xml file (depending on the version) on an Application Server can be modified to enable debug-level logging for two specific Compliance Level logger classes.

The logger entries to add are listed under "Logging properties file location".

An Application Server restart is not required after enabling debug-level logging.

For simplicity, the debug-level logging can be enabled on just one Application Server and a Job Routing rule can then be added to route the next run of the Compliance Job to this specific Application Server.

Once the additional debug-level logging has been enabled, and the job routing rule has been created, rerun the Compliance Job to generate the additional debug-level logging in the Application Server log.

Logging properties file location:

<installation_directory>/NSH/br/deployments/<deployment>

  • log4j.properties (TrueSight Server Automation version 8.9.04 P3 or earlier)
    • log4j.logger.com.bladelogic.om.infra.ast.visitor.conditionresult.evaluator.NewConditionResultSuccessVisitor=Debug
    • log4j.logger.com.bladelogic.om.infra.model.asset=Debug
  • log4j2.xml (TrueSight Server Automation version 20.02)
    • <Logger name="com.bladelogic.om.infra.ast.visitor.conditionresult.evaluator.NewConditionResultSuccessVisitor" level="DEBUG"/>
    • <Logger name="com.bladelogic.om.infra.model.asset" level="DEBUG"/>
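
After editing the file, a quick check can confirm that both loggers are present at debug level. The following is a minimal sketch, not a BMC-provided tool. It assumes the first logger name is the single class name NewConditionResultSuccessVisitor (it may appear wrapped across two lines), and that log4j2.xml uses un-namespaced Logger elements.

```python
import xml.etree.ElementTree as ET

REQUIRED_LOGGERS = (
    "com.bladelogic.om.infra.ast.visitor.conditionresult.evaluator."
    "NewConditionResultSuccessVisitor",
    "com.bladelogic.om.infra.model.asset",
)

def missing_in_properties(text):
    """Required loggers not set to debug in log4j.properties-style text."""
    levels = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("log4j.logger.") and "=" in line:
            name, value = line[len("log4j.logger."):].split("=", 1)
            # The value may be "DEBUG" or "DEBUG, appenderName".
            levels[name.strip()] = value.split(",")[0].strip().upper()
    return [n for n in REQUIRED_LOGGERS if levels.get(n) != "DEBUG"]

def missing_in_log4j2_xml(text):
    """Required loggers not set to DEBUG in log4j2.xml-style text."""
    root = ET.fromstring(text)
    levels = {el.get("name"): (el.get("level") or "").upper()
              for el in root.iter("Logger")}
    return [n for n in REQUIRED_LOGGERS if levels.get(n) != "DEBUG"]
```

Pass the file contents as a string, for example `missing_in_properties(open(path).read())`. An empty list means both loggers are configured.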

See KA 000389694 for additional details.

6

Generate Compliance Job Log Package

Generate the Compliance Job Log Package for review by BMC Customer Support.

Right-click a failed Compliance Job Run (one that ran after debug-level logging was enabled in step 5) and select "Download Log Package" to capture the required logs:

  1. Right-click the job run and select Download Log Package.
    Step1.jpg
  2. Specify a location on the local system:
    Step2.jpg
  3. Filter the list of Target Servers by status to select the Target Servers from which to collect agent logs.
    • All Success
    • All Failure
    • All Warning
  4. Select the desired set of targets.
  5. Click Save to begin the download process in the background. The download progress can be tracked in the bottom right corner of the console:
    step5.jpg

Once the process is complete, a popup window confirms that the logs have been generated and downloaded.

7

Analyze log files to understand where the performance issue(s) were encountered.

Identify a specific WorkItem-Thread which is executing the slow Compliance rule on a slow Target Server and follow the debug log entries for that WorkItem-Thread to determine the bottleneck.
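
Following one thread by hand is tedious in a large log, so a script can list the longest pauses per thread as a starting point. The following is a minimal sketch under an assumed line layout in which each entry begins with a bracketed timestamp followed by a bracketed WorkItem-Thread name. Adjust the pattern to match the actual Application Server log. The sample lines are invented for illustration.

```python
import re
from datetime import datetime

# Assumed layout: [05 Mar 2021 10:15:30,123] [WorkItem-Thread-12] ...
# Month names are parsed with %b, which depends on the locale.
LINE = re.compile(
    r"^\[(\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3})\] \[(WorkItem-Thread-\d+)\]"
)

def largest_gaps(lines, top=5):
    """Return the `top` longest pauses between consecutive entries of the
    same WorkItem-Thread as (seconds, thread, line_before_the_pause)."""
    last = {}
    gaps = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%d %b %Y %H:%M:%S,%f")
        thread = match.group(2)
        if thread in last:
            prev_stamp, prev_line = last[thread]
            seconds = (stamp - prev_stamp).total_seconds()
            gaps.append((seconds, thread, prev_line.rstrip()))
        last[thread] = (stamp, line)
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]

# Invented sample entries, for illustration only.
sample = [
    "[05 Mar 2021 10:15:30,000] [WorkItem-Thread-12] [DEBUG] collecting part",
    "[05 Mar 2021 10:15:31,000] [WorkItem-Thread-13] [DEBUG] collecting part",
    "[05 Mar 2021 10:17:30,500] [WorkItem-Thread-12] [DEBUG] evaluating rule",
]
for seconds, thread, before in largest_gaps(sample):
    print(f"{seconds:8.1f}s  {thread}  after: {before}")
```

The entry printed after each gap is the last thing the thread logged before it went quiet, which is usually the operation worth investigating.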

If unable to identify and resolve the problem, see Step 8 to create a BMC Support Case.

See KA 000389694 for additional details.

8

Create a BMC Support Case.

Provide the following information and log files when creating a case with BMC Customer Support:

  • Scope of the issue as identified in step 1 above
  • Results of the "Test Rule" performed in step 3
  • Results of the tests performed directly on the Target Server and via Live Browse in step 4 (if applicable)
  • Job Log Package generated in step 6


 


TrueSight Server Automation 21.02