Troubleshooting performance issues


This topic provides information about the steps that you can take to troubleshoot various performance issues.   

Troubleshooting overall product performance issues

Symptoms:

  • Java heap OutOfMemoryError (%BMC_ITDA_HOME%\logs)
  • GC overhead limit exceeded error (%BMC_ITDA_HOME%\logs)
  • Slow operations (UI/search)

Solution: Use one or more of the following steps to troubleshoot overall performance issues.

  • Ensure that the maximum Java heap sizes are properly set for each component.

    For more information, see Component configuration recommendations.
  • Ensure that the system has enough physical RAM to prevent paging and swapping. Having additional RAM is beneficial, because the Indexer uses OS-level disk file caching, which can significantly increase performance. For more information, see Sizing and scalability considerations.
  • Check the CPU utilization of the computer on which the product is installed. If the CPU utilization is extremely high, first determine which component has the highest CPU utilization, and then go to the section of this topic that covers troubleshooting for that specific component.
  • Ensure that you have enough CPUs for the expected load. For more information, see Sizing and scalability considerations.
  • Verify that your disk I/O subsystem has adequate performance and is not acting as a bottleneck.
  • Verify that the data per index (defined by the intervalInHrs property) is properly set for both the Collection Station and the Console Server + Search. For more information, see Component configuration recommendations.
  • Ensure that no antivirus software is actively scanning files and that no other third-party application is running on the same system as the product components, so that the product is not competing for resources.
  • Check network connectivity between the web browser and the IT Data Analytics server.
  • Refresh the web browser to make certain that it is running the latest code.
  • Restart the web browser.
  • Verify that the computer running the web browser has enough physical RAM available and that CPU cycles are available.
  • In the event of high CPU utilization, check the itda.log file at %BMC_ITDA_HOME%\logs to see the number of concurrent searches that are in progress.
    If too many searches appear to be running concurrently, reduce the number by using one or more of the following suggestions:
    • Execute memory-intensive searches during off-peak hours. Such searches include search strings that are likely to process a large number of search results, complex search strings, or search strings containing generic key words followed by multiple search commands.
    • Distribute the scheduled executions for notifications so that they are not triggered at the same time.
    • Reduce the number of viewlets that you have added.
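
The log checks in this section can be partly automated. The following Python sketch counts the lines that contain the two memory-related error strings listed under Symptoms. It is a minimal illustration and not part of the product: the function names are hypothetical, and it assumes that the log file is plain text that you can read from the computer where the script runs.

```python
from collections import Counter
from typing import Iterable

# Substrings taken from the symptoms listed in this section.
MEMORY_ERROR_MARKERS = ("OutOfMemoryError", "GC overhead limit exceeded")


def count_memory_errors(lines: Iterable[str]) -> Counter:
    """Count how many lines contain each memory-related marker."""
    counts = Counter({marker: 0 for marker in MEMORY_ERROR_MARKERS})
    for line in lines:
        for marker in MEMORY_ERROR_MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


def count_memory_errors_in_file(path: str) -> Counter:
    """Hypothetical helper: scan one log file on disk, line by line."""
    # errors="replace" keeps the scan going if the log has odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_memory_errors(handle)
```

A single log line can match both markers, so the two counts are reported separately rather than summed. A nonzero count only indicates that the heap settings are worth reviewing; it does not identify which setting to change.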

Troubleshooting Indexer performance issues

The following list illustrates the Indexer-related performance issues that you are likely to face.

Symptom: High CPU utilization on the server hosting the Indexer

Solution: Use one or more of the following steps to troubleshoot the Indexer performance issues:

  • Check the Collection Station’s collection.log file (located at %BMC_ITDA_HOME%\station\collection\logs) to find the following error:
    org.apache.flume.ChannelException: The channel has reached it's capacity. This might be the result of a sink on the channel having too low of batch size, a downstream system running slower than normal, or that the channel capacity is just too low.
    If this error appears, ensure that the intervalInHrs property is properly set based on the expected load. By default, every index is optimized every 2 hours. If you have 6-hour interval indices and too much data in the index, the optimizer takes a long time to run and significantly degrades the overall system performance. For more information, see Component configuration recommendations.
  • Even if you do not see the OutOfMemoryError or the GC overhead limit exceeded error in the log file, the Indexer might still have higher CPU utilization due to excessive Java garbage-collection activity. This issue can occur if the maximum Java heap size setting is too low and needs to be adjusted. For more information, see Component configuration recommendations.
  • Remove high-cardinality fields (fields with a large number of unique values) from the Filters panel on the Search tab.
  • Reduce the number of fields that are calculated in the Filters panel on the Search tab.
  • Narrow the time range for your search query.
  • Make it a practice to explicitly pause or stop search queries that are processing a large quantity of results or that take a long time to complete, when you find that you have enough data.
  • While using search commands with the by parameter, make sure that the by field has low cardinality (a low number of unique values).
  • Reduce the number of concurrent searches.
  • When searching, try to reduce the number of matches first, before piping the data to the search commands.
  • For data collectors containing a high volume of data, consider using the Ignore Data Matching Input option available while creating or modifying the data collector, under Advanced Options. For more information, see Collecting data into the system.
  • If you notice a correlation between high CPU utilization and the anomaly baseline job running every 15 minutes on the search server(s), then consider horizontal scaling of the Indexer.

Restart the Indexer after making any configuration changes. For more information, see Starting or stopping product services.
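
Several of the steps above depend on knowing which fields have high cardinality. The following Python sketch shows one way to estimate this from a sample of events that you have already exported from a search. It is an illustration only: the function names, the threshold, and the representation of events as dictionaries of field names to string values are assumptions, not product features.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def field_cardinality(events: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Return the number of distinct values seen for each field."""
    distinct = defaultdict(set)
    for event in events:
        for field, value in event.items():
            distinct[field].add(value)
    return {field: len(values) for field, values in distinct.items()}


def high_cardinality_fields(
    events: Iterable[Mapping[str, str]], threshold: int
) -> List[str]:
    """List fields with more than `threshold` distinct values, highest first."""
    counts = field_cardinality(events)
    flagged = [field for field, count in counts.items() if count > threshold]
    return sorted(flagged, key=lambda field: counts[field], reverse=True)
```

Fields that this check flags are candidates to remove from the Filters panel or to avoid as the by field in search commands. The result reflects only the sample that you supply, so a field with few values in a small sample can still have high cardinality overall.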

Symptom: OutOfMemoryError for the Java heap in the Indexer logs

Solution: Use one or more of the following steps to troubleshoot the Indexer performance issues:

Troubleshooting Collection Station performance issues

Symptom: High CPU utilization on the server hosting the Collection Station.

Solution: Use one or more of the following steps to troubleshoot the Collection Station performance issues:

  • If you notice that the Collection Station is taking up most of the CPU resources, it is possible that the Indexer was not available. Check the Collection Station’s collection.log file (located at %BMC_ITDA_HOME%\station\collection\logs) to see if it had problems sending events to the Indexer. Check the Indexer’s log file (located at %BMC_ITDA_HOME%\logs\indexer) for clues as to why it was too busy or why the process went down.
  • Check the Collection Station’s collection.log file and verify that the Collection Agent’s flume channel is not full (that is, the Indexer is able to keep up with the rate of data being sent by the Collection Agents). If the Indexer is unable to keep up, ensure that intervalInHrs is properly set based on the expected load. By default, every index is optimized every 2 hours. If you have 6-hour interval indices and too much data in the index, the optimizer takes a long time to run and significantly degrades the overall system performance. For more information, see Component configuration recommendations.
  • Check the Collection Station’s collection.log file to see if the OutOfMemoryError or GC overhead limit exceeded errors occurred. If such errors did occur, increase the Collection Station’s maximum Java heap size. For more information, see Component configuration recommendations.
  • If you notice that a large number of notifications with PDF reports are getting generated, then consider reducing the frequency of these notifications. Additionally, you can also consider reducing the number of notifications containing PDF reports.

Restart the Collection Station after making any configuration changes. For more information, see Starting or stopping product services.
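
To see why a large intervalInHrs value can slow down optimization, it can help to estimate how much data each index holds. The following Python sketch does this under a simplifying assumption: data arrives at an even rate throughout the day. The function name and the daily volume used in the example are hypothetical and do not come from the product.

```python
def data_per_index_gb(daily_volume_gb: float, interval_in_hrs: float) -> float:
    """Estimate the data held by one index, assuming an even ingest rate."""
    if daily_volume_gb < 0:
        raise ValueError("daily_volume_gb must not be negative")
    if interval_in_hrs <= 0:
        raise ValueError("interval_in_hrs must be greater than zero")
    # One index covers interval_in_hrs out of every 24 hours of data.
    return daily_volume_gb * interval_in_hrs / 24
```

For example, with a hypothetical 120 GB of data per day, data_per_index_gb(120, 6) returns 30.0 and data_per_index_gb(120, 2) returns 10.0, so a shorter interval gives the optimizer less data to process per index.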

Troubleshooting Console Server or Console Search performance issues

Symptom: High CPU utilization on the server hosting the Console Server.

Solution: Use one or more of the following steps to troubleshoot the Console Server performance issues:

  • Check the itda.log file (at %BMC_ITDA_HOME%\logs) for the OutOfMemoryError or the GC overhead limit exceeded error. If you find either of these errors, increase the Console Server’s maximum Java heap size. For more information, see Component configuration recommendations.
  • Reduce the number of fields that are calculated in the Filters panel on the Search tab.
  • Narrow the time range for your search query.
  • Make it a practice to explicitly pause or stop search queries that are processing a large quantity of results or that take a long time to complete, when you find that you have enough data.
  • When using search commands with the by parameter, ensure that the by field has low cardinality (a low number of unique values).
  • Reduce the number of concurrent search commands.
  • If the data collector’s polling interval (default: 1 minute) is consistent with the scheduled poll frequency, and you previously increased the collection.thread.pool.size property to a value much higher than the default value, try decreasing this value.
    For more information, see Modifying the configuration files.
  • Reduce the number of notification alerts and reports.

Restart the Console Server after making any configuration changes. For more information, see Starting or stopping product services.

Symptom: The anomaly baseline job run takes longer than 7 minutes to complete.

Solution: Use the following to troubleshoot the Console Search performance issue:

  • Check the Console Server’s itda.log file for the Job completed message.
  • If the Indexer has high CPU utilization, consider scaling the Indexer horizontally.
  • Consider scaling the number of servers running the Search component.

Troubleshooting Configuration Database performance issues

Symptom: High CPU utilization on the server hosting the Configuration Database.

Solution: Increase the polling interval of the data collectors configured.

Symptoms:

  • Configuration Database process ends
  • OutOfMemoryError (%BMC_ITDA_HOME%\logs\services\configdb.log)

Solution: Increase the maximum memory required for the Configuration Database. For more information, see Component configuration recommendations for horizontal scaling.

Symptom: When you are performing any operation (adding, editing, cloning, and so on) related to data collectors, the Configuration Database goes down.

Solution: Use one or more of the following steps:

  • Reduce the number of tags added in the data collector.
  • Reduce the number of fields specified in the data pattern that is being used by the data collector.

Troubleshooting Collection Agent performance issues

Symptom: Recent data for a collector is not available in the search result and the Collection Agent log file has the following error messages:

Failed to push events to sink

Caused by ChannelFullException

Solution: Configure the Collection Agent to replace the flume.conf file. For more information, see Setting up Collection Agents.

Troubleshooting data collection falling behind schedule

Symptom: The data collectors are falling behind their scheduled polling interval.

Solution: Use the following steps to resolve this issue:

  1. In the Collection Station’s agent.properties file, increase the value of the collection.thread.pool.size property.
    For more information, see Modifying the configuration files.
  2. Decrease the intervalInHrs setting, which defines the number of hours of data per index (for both the Console Server and the Collection Agent).
    For more information, see Component configuration recommendations.
  3. Increase the Indexer’s maximum Java heap size setting.
    For more information, see Component configuration recommendations.
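
Before and after applying the steps above, it can be useful to measure how far polling has drifted. The following Python sketch compares the gaps between consecutive poll start times with the configured polling interval. It is an illustration under stated assumptions: the function name and the tolerance parameter are hypothetical, and the timestamps are values that you collect yourself, because this topic does not describe where the product records them.

```python
from datetime import datetime, timedelta
from typing import List, Sequence


def late_polls(
    poll_times: Sequence[datetime],
    interval: timedelta,
    tolerance: timedelta = timedelta(0),
) -> List[timedelta]:
    """Return how late each poll started, for polls later than the tolerance.

    poll_times must be in chronological order. Lateness is measured
    against the previous poll plus the configured interval.
    """
    lateness = []
    for previous, current in zip(poll_times, poll_times[1:]):
        delay = (current - previous) - interval
        if delay > tolerance:
            lateness.append(delay)
    return lateness
```

An empty result means that every poll started within the tolerance of its expected time. A list that keeps growing after you change collection.thread.pool.size or intervalInHrs suggests that the change has not resolved the issue.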

Troubleshooting the issue where a data collector is not created

Symptom: The data collector is not created after you add a new data collector template in an existing collection profile, and apply it on the host.

Solution: Delete all the data collectors that the profile has previously created, and apply the collection profile again.
