Troubleshooting Java memory management
This topic describes common issues with TrueSight Infrastructure Management Java memory management and how to troubleshoot them. The goal is to help you avoid potential performance impacts.
Java memory management - Useful links from BMC Communities
- Webinar Recording - Monitoring Java Memory Utilization in TrueSight Operations Management
- java.lang.OutOfMemoryError: unable to create new native thread error seen in the TrueSight.log on Linux TSPS Server 11.0
- TSIM jserver fails to start with java.lang.OutOfMemoryError: unable to create new native thread
- TSPS Server crashing with java.lang.OutOfMemoryError: GC Overhead limit exceeded errors
Memory issues can appear on almost any component, but the most common areas of concern are pserver, jserver, csr, index server, and agent controller.
If you experience performance and/or memory issues, run the TrueSight Health Check Tool. See TrueSight Health Check Tool.
Infrastructure Management Maintenance Tool
The Infrastructure Management Maintenance Tool utility is included with the Infrastructure Management installation, and is made up of tabbed pages that provide options for administering Infrastructure Management. You can use this utility to:
- View installation and configuration log files
- Locate and view other log files on your system
- Zip and send files to BMC Customer Support
- Run the post-installation health check
- Set an encrypted password
- Run the memory usage and disk space check
- Update performance tuning parameters
For details, see Administering Infrastructure Management Maintenance Tool.
Tune Parameters tab
You can use the Infrastructure Management Maintenance Tool Tune Parameters tab to configure the performance tuning parameters.
Perform the following steps to configure the performance tuning parameters using the Tune Parameters tab:
- Start Infrastructure Management Maintenance Tool.
- Click the Tune Parameters tab.
- Click Tune Parameters if you want to tune server performance parameters after modifying the parameter values in the ServerPerformanceParameters.csv file located in the pw\custom\conf directory.
- When the performance tuning is completed, restart all the Infrastructure Management services.
Diagnosing and reporting an issue
After you identify the symptoms and scope of the issue, use the troubleshooting guide to help diagnose and resolve the issue or to contact Customer Support.
Action | Steps |
---|---|
Jserver process crash due to insufficient memory | If the process crashed because of insufficient memory, a memory dump file with the .hprof extension is created in the <PnServerPath>\pw\pronto\logs directory. If the JVM itself crashed, an error log file that follows the naming convention hs_err_pid*.log is created under <PnServerPath>\pw\pronto\tmp. |
Jserver process in hung state | Determining this state is not straightforward. Typically, the Operations Console or Administration Console becomes unresponsive because requests to jserver take longer to respond. Note that this is a possible symptom, not a definitive one. In this case, collect a thread dump. On Windows: pw threaddump jserver On Solaris: kill -QUIT <pid> On Windows, a file named jserverTD.out is created in the <PnServerPath>\pw\pronto\logs directory. On Solaris, the output is captured in <PnServerPath>\pw\pronto\logs\jserver.out. |
How to enable debug logs for jserver process | To enable debug for the jserver process, run: pw debug on -p jserver [-s <subsystem>] Enabling debug for all subsystems is expensive and floods the log files with messages, so enable it only for the required subsystems, chosen based on the nature of the issue. To list the subsystems under the jserver process, run: pw debug list -p jserver When debug is enabled, the log messages are captured in <BPPM_HOME>\pw\pronto\logs\debug\jserver.log. Turn debug off after collecting the logs, because it is an expensive operation. |
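When several servers need checking after a suspected jserver crash, a small script can look for the dump files described in the table. The following is a minimal sketch, not a BMC-provided tool: the function name find_crash_artifacts is hypothetical, and it assumes the directory layout given above (pw/pronto/logs for .hprof files and pw/pronto/tmp for hs_err_pid*.log files).

```python
from pathlib import Path


def find_crash_artifacts(server_path):
    """Return memory dumps and JVM crash logs found under an installation path.

    server_path stands in for <PnServerPath>; the subdirectories are the
    ones named in the troubleshooting table (an assumption of this sketch).
    """
    root = Path(server_path)
    logs_dir = root / "pw" / "pronto" / "logs"
    tmp_dir = root / "pw" / "pronto" / "tmp"

    # .hprof files are written when the process runs out of memory.
    heap_dumps = sorted(logs_dir.glob("*.hprof")) if logs_dir.is_dir() else []
    # hs_err_pid*.log files are written when the JVM itself crashes.
    crash_logs = sorted(tmp_dir.glob("hs_err_pid*.log")) if tmp_dir.is_dir() else []

    return {"heap_dumps": heap_dumps, "crash_logs": crash_logs}
```

Calling find_crash_artifacts with the installation path returns two lists of paths. An empty result only means that no files matched in those two directories.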
Resolutions for common issues
Slow performance in TrueSight Infrastructure Management
This section applies if you are seeing poor GUI performance and jserver is consuming high levels of CPU when you navigate the tool.
Slowness can have many causes, so Support will ask for hardware specifications, the load on the application, and any outside factors (patching, other applications on the box, network connectivity issues, and so on). The goal is to get an overall view of your environment to help narrow down the focus of the search.
Although a variety of factors can be involved, one way to reduce slowness is to enable lazyloading. This property makes the console load objects more efficiently.
Certain performance problems can be observed within the navigation tree of the Ops Console. Examples of this could be:
- Navigation tree does not load at all
- Navigation tree only partially loads
- Navigation tree takes a long time to load
- Operations Console becomes inaccessible and the number of https processes increases
Setting the lazy loading parameter to true can help to resolve these problems. The performance problems seen in the navigation tree can be caused by a number of things such as:
- High number of events
- High number of dynamic collectors
- High number of CIs in the service model
- Depth of service model
- Large number of component folders
- Large number of event folders
We recommend that you uncheck those items in the navigation tree preferences that you are not interested in. For example, if you are interested in events only, then uncheck the boxes related to service model and folders.
This is because the navigation tree loads data from top to bottom: event collectors first, then groups, then service model CIs, then component folders, then event folders. Each of these must be loaded from sources such as the cell, jserver, and the backend database, which takes time and can create a bottleneck.
The lazyloading option loads only the top-level elements. When a user expands a top-level element, the data for that element is fetched at that point. That is why traversing from the top level down can seem slower, but it greatly reduces the load on the jserver.
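The following sketch illustrates the general lazy-loading idea only. It is not how the Operations Console is implemented: the LazyNode class, the fetch function, and the BACKEND data are all hypothetical. It shows that children are fetched on first expansion and that nothing below an unexpanded node is loaded.

```python
class LazyNode:
    """Tree node whose children are fetched only on first expansion."""

    def __init__(self, name, fetch_children):
        self.name = name
        self._fetch_children = fetch_children  # callable: node name -> child names
        self._children = None  # None means "not loaded yet"

    def expand(self):
        # Fetch children on the first call; reuse them on later calls.
        if self._children is None:
            self._children = [
                LazyNode(child, self._fetch_children)
                for child in self._fetch_children(self.name)
            ]
        return self._children


# Hypothetical data standing in for the cell, jserver, and database sources.
BACKEND = {
    "root": ["Event Collectors", "Groups", "Services"],
    "Services": ["Service A", "Service B"],
}
fetch_log = []  # records which nodes triggered a fetch


def fetch(name):
    fetch_log.append(name)
    return BACKEND.get(name, [])


root = LazyNode("root", fetch)
top_level = root.expand()          # first fetch: top-level elements only
services = top_level[2].expand()   # second fetch: happens on expansion
```

After these two calls, fetch_log contains only "root" and "Services"; the other top-level elements have not caused any fetch because they were never expanded.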
To turn the lazyloading option on, add the following property to the pw/custom/conf/pronet.conf file:
pronet.navtree.lazyloading=true
Then reload the jserver properties with the following command:
pw jproperties reload
Then monitor the performance of the TrueSight Infrastructure Management Operations Console.
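If the same property change has to be applied on several servers, editing pronet.conf by hand invites duplicate or conflicting entries. The following is a minimal sketch of an idempotent update, assuming the file consists of simple key=value lines with # comments. The helper name set_property is hypothetical and not part of the product; back up the file before running anything like this.

```python
from pathlib import Path


def set_property(conf_path, key, value):
    """Add key=value to a properties file, or update the key if it exists.

    Returns True if the file was changed, False if it was already correct.
    Assumes plain key=value lines; lines starting with # are left alone.
    """
    path = Path(conf_path)
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = f"{key}={value}"
    found = False
    changed = False

    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # comment or non-property line
        if stripped.split("=", 1)[0].strip() == key:
            found = True
            if stripped != new_line:
                lines[i] = new_line
                changed = True

    if not found:
        lines.append(new_line)
        changed = True

    if changed:
        path.write_text("\n".join(lines) + "\n")
    return changed
```

For example, set_property("pw/custom/conf/pronet.conf", "pronet.navtree.lazyloading", "true") adds the line once and returns False on later runs. The script only edits the file; the jserver properties still have to be reloaded as described above.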
If performance continues to be an issue, the nearcache properties can also help with TrueSight Infrastructure Management slowness.
Add the following properties to the pw/custom/conf/pronet.conf file:
pronet.hotrod.client.rate.custom.nearcache.enable=false
pronet.hotrod.client.jserver.custom.nearcache.enable=true
pronet.tsim.console.data.fetch.optimized=false
Then restart the TrueSight Infrastructure Management server (pw system start). These properties help to solve performance issues in the Operations Console and Administration Console, and they are set by default in TrueSight 11.3.01.
If the issue remains after enabling the properties above, collect the pw dump 1 output and send the details to Support for further assistance.
TrueSight Infrastructure Management component stuck in the “Initializing” state
While there may be many factors involved in this type of situation, it is always best to send the logs to Support to review them for other possible causes of the issue.
One known cause is the following Elasticsearch error, which prevents the component from communicating (syncing) with TSPS and causes it to fail:
INFO 06/21 09:32:53.393 [EventMsg_28] TsimAudit Failed to insert events to ES
java.util.concurrent.ExecutionException: RemoteTransportException[[ZbkOM4q][127.0.0.1:9300][cluster:admin/ingest/pipeline/put]];
nested: ScriptException[compile error]; nested: IllegalArgumentException[Scripts may be no longer than 16384 characters. The passed in
script is 21057 characters. Consider using a plugin if a script longer than this length is a requirement.];
This issue with Elasticsearch is fixed in TrueSight 11.3.01. A workaround is to delete the Elasticsearch data folder and restart the Presentation Server. The Elasticsearch folder can be found here: <TSPS_HOME>/modules/elasticsearch/
Delete the entire folder; it is recreated when the Presentation Server restarts.
If clearing the Elasticsearch folder does not allow the TrueSight Infrastructure Management component to initialize, gather the output of the tssh dump export command and send it to Support for further review.
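Before deleting the folder, it can help to confirm that the log really contains the script-length error shown above and not some other failure. The following is a minimal sketch that searches log text for that message and extracts the two numbers. The function name find_script_length_errors is hypothetical, the pattern is based only on the sample error in this topic, and which log file to read is left to you.

```python
import re

# Matches the message from the sample error; \s+ tolerates the line wrap
# that appears between "passed in" and "script is" in the log excerpt.
SCRIPT_LIMIT_RE = re.compile(
    r"Scripts may be no longer than (\d+) characters\.\s+"
    r"The passed in\s+script is (\d+) characters"
)


def find_script_length_errors(log_text):
    """Return a list of (limit, actual_length) pairs found in the log text."""
    return [
        (int(limit), int(actual))
        for limit, actual in SCRIPT_LIMIT_RE.findall(log_text)
    ]
```

Applied to the excerpt above, the function returns one pair, (16384, 21057). An empty list suggests a different cause, in which case sending the logs to Support is the better next step.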