How to resolve high CPU utilization by QPMON
Adjustments to monitoring and memory settings may be needed when encountering one or more of the following issues:
- QPMON is utilizing high CPU
- Some queue managers display the "data in doubt" warning on their physical views
- Objects are unexpectedly removed from the monitoring configuration
- The temporary dynamic reply queues used by the MQ Monitoring Extension (with names like QPasa.XXXX) fill up
It is always recommended to upgrade the product (services and agents) to the latest version and apply all available maintenance packs. If an upgrade is not currently an option, apply as many of the configuration changes explained below as your environment allows or requires.
The first step is to increase the sample interval to the longest period that still meets your monitoring requirements. In the 8.0.00 release, the default sample interval was doubled to 60 seconds, halving the CPU used by qpmon and by MQ on qpmon's behalf (in earlier releases the default was 30 seconds). If your agent installation was upgraded from a release earlier than 8.0.00, the setting may still be at the previous default of 30 seconds; in that case we recommend raising it to at least 60 seconds. Some users find that 120 seconds, or even 300 seconds (5 minutes), provides an acceptable balance between resource consumption and monitoring frequency.
agentpref.sh --set "WebSphere MQ Monitor" SampleInterval 60
All platforms except z/OS
Run through the following configuration options:
- Check the OpenOutputCount on the Child Object Summary table to note the number of queue handles. Values greater than 20000 may be problematic, so consider turning QueueHandleMonitoring off with the following command:
agentpref.sh --set "WebSphere MQ Monitor" QueueHandleMonitoring off
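If you prefer to check handle counts directly from MQ rather than from the Child Object Summary table, a roughly equivalent check (our suggestion, not part of the product's documented procedure) is to display the per-queue open-input and open-output handle counts in runmqsc:

```shell
# QM1 is a placeholder queue manager name -- substitute your own.
# IPPROCS/OPPROCS are the open-for-input/open-for-output handle counts per queue.
echo "DISPLAY QSTATUS(*) TYPE(QUEUE) IPPROCS OPPROCS" | runmqsc QM1
```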
- Check the tempDyn queue depth from runmqsc using the following command:
dis ql(QPASA*) curdepth
If the queue depth is high, raise the MAXDEPTH of the SYSTEM.DEFAULT.MODEL.QUEUE so TMTM's tempDyn queues have more headroom.
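Raising the model queue's MAXDEPTH can be done from runmqsc; the depth value below is illustrative only, so choose a limit appropriate for your environment:

```shell
# QM1 is a placeholder queue manager name; 100000 is an example depth.
# Newly created tempDyn queues inherit the larger MAXDEPTH from the model;
# existing dynamic queues are unaffected until they are recreated.
echo "ALTER QMODEL(SYSTEM.DEFAULT.MODEL.QUEUE) MAXDEPTH(100000)" | runmqsc QM1
```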
- Clear the checkpoint.bin and rtsp.xml files:
- Stop the agent and extensions.
- Remove or rename checkpoint.bin and rtsp.xml, which both reside in the agent directory.
- Start the agent and extensions.
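The steps above can be sketched as follows. This is a minimal sketch: AGENT_DIR and the demo files it creates are stand-ins, so point AGENT_DIR at your real agent directory, and stop the agent and extensions before running it.

```shell
#!/bin/sh
# Sketch of the checkpoint-clearing step. AGENT_DIR defaults to a
# throwaway demo directory with dummy files; in a real run, set it to
# the actual agent directory and remove the two demo lines.
AGENT_DIR="${AGENT_DIR:-$(mktemp -d)}"                     # demo stand-in
touch "$AGENT_DIR/checkpoint.bin" "$AGENT_DIR/rtsp.xml"    # demo files only
STAMP=$(date +%Y%m%d%H%M%S)
for f in checkpoint.bin rtsp.xml; do
    if [ -f "$AGENT_DIR/$f" ]; then
        mv "$AGENT_DIR/$f" "$AGENT_DIR/$f.$STAMP.bak"      # rename, don't delete
        echo "renamed $f"
    fi
done
```

Renaming with a timestamp suffix (rather than deleting) leaves a fallback copy in case the agent misbehaves after the restart.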
- Turn off parameters that can drive high CPU. Note that the first command below disables queue manager auto-discovery.
agentpref.sh --set "WebSphere MQ Monitor" DiscoverQueueManagers off
agentpref.sh --set "WebSphere MQ Monitor" ChannelBatchModeMonitoring off
agentpref.sh --set "WebSphere MQ Monitor" QueueStatusMonitoring off
agentpref.sh --set "WebSphere MQ Monitor" QueueBatchModeMonitoring off
- If Telemetry is not installed or is not being monitored, turn Telemetry monitoring off using the following command:
agentpref.sh --set "WebSphere MQ Monitor" MQTTMonitoring off
- As a last resort, you can disable the discovery of MQ objects, but this only makes a noticeable difference when fewer than 10% of the discovered objects are actually monitored.
agentpref.sh --set "WebSphere MQ Monitor" DiscoverObjects off
When done, restart the agent and extensions.
For AIX only
If the issue occurs with agents running on AIX, check the AIX memory settings for the TMTM agent and extension binaries, and watch qpmon's memory usage with nmon. The current maxdata setting of each binary can be displayed with dump:
dump -X64 -o qpea
dump -X64 -o qpmon
dump -X64 -o qpcfg
In addition, if the reported maxdata value is 0x00000000 (the 256 MB default), 0x20000000 (512 MB), or 0x40000000 (1 GB), consider increasing it with the ldedit commands below (up to a maxdata of 0x80000000, i.e. 2 GB).
The default for 7.0 and 8.0.00 is bmaxdata:0x20000000. Versions 8.0.01 and 8.1.00 use bmaxdata:0x40000000 as the default.
ldedit -bmaxdata:0x80000000 qpea
ldedit -bmaxdata:0x80000000 qpmon
ldedit -bmaxdata:0x80000000 qpcfg
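After running ldedit, you can confirm the change took effect by re-running dump; filtering the output with grep is our own convenience, not a product-documented step:

```shell
# Re-display the optional header and check the maxdata field of each binary.
dump -X64 -o qpmon | grep -i maxdata
```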
All agentpref commands shown are Linux/UNIX examples; on Windows, use agentpref.bat with the same command parameters.