Troubleshooting JVM Blocking issues



When you observe issues in performance of the AR System server due to blocking of the Java Virtual Machine (JVM), use this information to either perform the appropriate troubleshooting steps to resolve the problem or create a BMC Support case.


Related topics

Symptoms

  • End users are unable to login. Users receive ARERR 91 RPC errors. Frequency of the errors can be intermittent. 
  • Server is unresponsive even when the process is running.
  • Thread dumps show “pool-“ AR threads in BLOCKED/WAITING or RUNNABLE with “locked by” object.

Scope

  • Multiple users are affected. But if the problem is related to a private RPC queue used by a specific client, the problems are limited to that part of the system.
  • The problem may be intermittent but occurs more frequently during periods of high activity such as the start of the working day when many users begin to use the system.
  • In a server group environment, this could affect any number of servers in the group. One server, many servers, or all servers can be impacted.

Diagnosing and reporting an issue

Step

Task

Description

1

Capture thread dumps and application snapshot when the symptom occurs. 

Perform the following steps immediately when you experience the errors, otherwise skip to step 3.

  1. Follow the instructions in this knowledge article to capture ARServer heap dumps, thread dumps and application snapshot.

  2. Collect 3-5 heap dumps 10 to 30 seconds apart.
  3. Collect 3-5 thread dumps 10 to 30 seconds apart per step 12 of the article.
  4. Take an application snapshot as per step 14 mentioned in the article. 

2

Examine the thread dump


    • Open the Thread Dump with a Text Editor

AR System server client threads starts with "pool- ".


    • Examine the class names at the top of the stack where thread name starts with “pool-“.
      Search for “locked” keyword.

For example:pool-List-thread-8

threadId:536 - state:RUNNABLE
stackTrace:
java.lang.Thread.State: RUNNABLE

at java.net.SocketInputStream.read(Unknown Source)

at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1877)

at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6642)

- locked <3b35c72e> (a com.microsoft.sqlserver.jdbc.TDSReader)

at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:7959)

For more information on how to read thread dumps, you can refer to this external topic https://dzone.com/articles/how-to-read-a-thread-dump

If you find "Locked" lines in the thread dump, search for class name near "locked" in that thread's stack. See the following Solution Table and search for the specific class name.

If you did not find "Locked" lines in the thread dump, proceed to the next step.

3

Enable diagnostics

If the preceding steps did not resolve the problem, collect and analyze more diagnostic information. This information can be used with the following Solution Table to identify a solution or to create a case with BMC Support.

Enable API, Filter, and SQL Logging. If the problem occurs on more than one server, enable logging on all the affected servers. To enable logs on multiple servers in a server group, use the Server Group Logging option from the Server Group Admin Console.

4

Reproduce the problem or wait for it to occur

Usually, JVM blocking conditions do not occur often enough or with enough consistency. We cannot predict when they will occur.  

Reproduce the JVM blocking condition if possible. Otherwise, wait until the issue occurs on its own.

5

Collect JVM diagnostics

Follow the preceding Step 1 to collect heap dumps, threads dumps, and application snapshot on any server that is experiencing the blocking.

6

Create a BMC Support Case

Collect and send logs and detailed information when creating a case with BMC Support:
1. Provide the following information as part of your case:

  • User name of users that reported the problem.
  • Activities that the users were performing when the blocking occurred.
  • Time and Date of blocked activity.
  • Server Name (if known) that user was connected to or that was known to be blocked.

2. Run the log zipper utilityon each affected indexer server:

  • Select Zip Logs, (not Zip All Logs).
    Note: If you will be providing the Log Zipper file from multiple servers, rename the zip file to include the server name as part of the filename.

3. Attach the zip file to your case (up to 2 GB) or transfer the files to BMC using FTP. For more information, see Steps to send logs, files, screenshots, etc to BMC Support for a Remedy Product related case.

Error messages and resolution

Error

Error message

Resolution

Threads blocked or waiting in class ch.qos.logback or com.bmc.arsys.logging indicates file system locking.

Thread Dump:

"pool-List-thread-20" - Thread t@535
   java.lang.Thread.State: BLOCKED
     at ch.qos.logback.core.AppenderBase.doAppend(AppenderBase.java:64)
     - waiting to lock <24d04176> (a com.bmc.arsys.logging.SuspendableRollingFileAppender) owned by "pool-List-thread-4" t@519
 

    at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
     at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:273)
     at ch.qos.logback.classic.Logger.callAppenders(Logger.java:260)
     at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:442)
     at ch.qos.logback.classic.Logger.filterAndLog_1(Logger.java:414)
     at ch.qos.logback.classic.Logger.debug(Logger.java:507)
     at com.bmc.arsys.server.logger.ContextLogger.debug(ContextLogger.java:123)
     at com.bmc.arsys.server.domain.runtime.impl.FilterExecutorImpl.beginEndFilterPhase(FilterExecutorImpl.java:2767)

Server, plugin or API client calls may slow down if IO is slow.

Deactivate unnecessary logging and use buffered logging.

Threads blocked or waiting in class oracle.jdbc.driver name indicates AR System server waiting on database. 

Thread Dump:

pool-Fast-thread-6" #394 prio=5 os_prio=0 tid=0x00007f2238840000 nid=0xb7c2 runnable [0x00007f2193b47000]
   java.lang.Thread.State: RUNNABLE

                at java.net.SocketInputStream.socketRead0(Native Method)

                at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)

                at java.net.SocketInputStream.read(SocketInputStream.java:171)

                at java.net.SocketInputStream.read(SocketInputStream.java:141)

                at oracle.net.ns.Packet.receive(Packet.java:308)

                at oracle.net.ns.DataPacket.receive(DataPacket.java:106)

                at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:324)

                at oracle.net.ns.NetInputStream.read(NetInputStream.java:268)

                at oracle.net.ns.NetInputStream.read(NetInputStream.java:190)

                at oracle.net.ns.NetInputStream.read(NetInputStream.java:107)

                at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124)

                at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80)

                at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137)
                at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:350)
                at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
                at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
                at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:195)
                at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:876)
                at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
                at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
                at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1498)

                - locked <0x00000006bf0a9cb0> (a oracle.jdbc.driver.T4CConnection)

AR System may present performance issues when the underlying problem is with the database. The AR System server is heavily dependent on the database. 

Capture API, SQL logs and review the arexception.log file.

Threads blocked or waiting in class  com.microsoft.sqlserver.jdbc indicates AR System server waiting on database.

Thread dump:

pool-List-thread-8
threadId:536 - state:RUNNABLE
stackTrace:
java.lang.Thread.State: RUNNABLE

at java.net.SocketInputStream.socketRead0(Native Method)

at java.net.SocketInputStream.socketRead(Unknown Source)

at java.net.SocketInputStream.read(Unknown Source)

at java.net.SocketInputStream.read(Unknown Source)

at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:1877)

at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6642)

- locked <3b35c72e> (a com.microsoft.sqlserver.jdbc.TDSReader)

at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:7959)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.doExecutePreparedStatement(SQLServerPreparedStatement.java:438)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement$PrepStmtExecCmd.doExecute(SQLServerPreparedStatement.java:385)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7505)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2445)
- locked <3d47a720> (a java.lang.Object)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:191)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:166)
at com.microsoft.sqlserver.jdbc.SQLServerPreparedStatement.executeQuery(SQLServerPreparedStatement.java:297)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
at org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
at net.sf.log4jdbc.PreparedStatementSpy.executeQuery(PreparedStatementSpy.java:735)

AR System may present performance issues when the underlying problem is with the database. The Developer Studio is heavily dependent on the database. 

Capture API, SQL logs and review the arexception.log file.

Threads blocked or waiting in class  DefaultMessageListenerContainer indicates a problem with Server Group Coordinator node.

- waiting to lock <68edc27a> (a java.util.concurrent.ConcurrentHashMap) owned by "DefaultMessageListenerContainer-1" t@457

 

Tip: For faster searching, add an asterisk to the end of your partial query. Example: cert*