Tomcat container workload configuration


A default Tomcat installation sets the maximum number of HTTP processing threads to 200, which means the system can service at most 200 simultaneous HTTP requests. When the number of simultaneous requests exceeds this count, the excess requests are placed in a queue and are serviced as processing threads become available. The default queue length is 100. At these default settings, a web load that generates more than 300 simultaneous requests exhausts both the thread pool and the queue, and the additional requests are rejected with HTTP 503 (Service Unavailable).
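
For reference, the HTTP connector in a typical default installation's conf/server.xml omits both attributes, so the defaults described above apply. This snippet is illustrative of a stock configuration, not a recommendation:

<Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000"
    redirectPort="8443"/>
<!-- No maxThreads or acceptCount attributes: Tomcat falls back to its
     defaults of 200 processing threads and a queue length of 100 -->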

You should configure Tomcat to handle your planned workload. The practical maximum number of processing threads depends on your hardware capacity. One way to determine when you have reached that capacity is to monitor the JVM runtime behavior, specifically CPU and heap utilization.

The following configuration values are used for request processing threads in Tomcat for a given connector:

  • maxThreads — The maximum number of request processing threads that a given connector creates.
  • acceptCount — The maximum queue length for incoming connection requests to the given connector when all possible request processing threads are in use.

You can configure multiple connectors per Tomcat instance, each running a different protocol such as AJP, HTTP, or HTTPS.
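For example, a server.xml file might define separate connectors for HTTP, HTTPS, and AJP traffic, each with its own thread pool. The following sketch is illustrative only; the ports and thread-pool values are placeholders, not recommended settings, and the HTTPS keystore configuration is omitted:

<!-- Plain HTTP connector with its own thread pool and request queue -->
<Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8"
    maxThreads="600" acceptCount="100" connectionTimeout="60000"
    redirectPort="8443"/>

<!-- HTTPS connector (keystore configuration omitted from this sketch) -->
<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
    scheme="https" secure="true" URIEncoding="UTF-8"
    maxThreads="600" acceptCount="100"/>

<!-- AJP connector, typically used behind a fronting web server -->
<Connector port="8009" protocol="AJP/1.3" redirectPort="8443"/>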

Ideally, these values would not need manual tuning: the software would service as many simultaneous incoming requests as possible and queue as many requests as necessary until it reached its service limit. That behavior is simple to state but impossible to implement, because a system's runtime behavior depends on many external variables.

This topic contains the following information:

  • maxThreads
  • acceptCount
  • Other connector properties
  • Tomcat thread configuration example

maxThreads

View the maxThreads value as a cap on runaway conditions. For example, if the AR System server is unresponsive, each arriving request ties up a processing thread that cannot complete, until eventually all allocated threads are in use and unavailable. Without this cap, the Tomcat process (or the JVM process hosting Tomcat) would eventually crash from creating too many processing threads.

Based on BMC load tests, a balanced value for the maxThreads parameter is twice the expected maximum number of concurrent users for your planned workload. Current browsers open multiple TCP socket connections to a given domain server name, so if user interactions with the mid tier are fairly consistent, twice the maximum concurrent users leaves enough spare threads to reasonably handle unexpected surges. However, you can increase this value if your use cases require long-running BMC Remedy AR System API calls (which tie up service threads for longer) or if you have heads-down users, such as at a call center (which generates a higher number of simultaneous HTTP requests).

In practice, for a BMC Remedy AR System platform deployment, the number of actual processing threads rarely reaches the maxThreads value. When it does, the cause is usually another issue, such as an unresponsive AR System server or poor load planning. To view the maximum number of service threads created over a monitored interval, use the JVM monitor data: in jvisualvm, click the Threads tab, then the Table tab, and sort by thread name.

Note

For a deployment with a larger number of concurrent users, the guideline of setting maxThreads to twice the number of users might allocate too many threads; the relationship is not linear. For example, if you planned for 30 concurrent users, the probability that all 30 users access the mid tier simultaneously is high. However, if you planned for 3,000 concurrent users, the probability of all 3,000 users accessing the mid tier simultaneously is almost zero. For higher concurrent-user loads, you can taper off your allocation of HTTP service threads, as in the sketch below.
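
The following snippets illustrate that tapering; the numbers are purely illustrative and are not a BMC-published formula:

<!-- 30 planned concurrent users: the 2x guideline applies directly -->
<Connector port="80" protocol="HTTP/1.1" URIEncoding="UTF-8"
    maxThreads="60" acceptCount="100"/>

<!-- 3,000 planned concurrent users: taper below the 2x value of 6,000,
     because all users are very unlikely to be active simultaneously
     (the figure of 4000 here is only an example) -->
<Connector port="80" protocol="HTTP/1.1" URIEncoding="UTF-8"
    maxThreads="4000" acceptCount="250"/>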

The following figure shows that, over the monitored interval, the JVM created 213 HTTP service threads for the SSL (HTTPS) connector on port 443 under the running load. If the maximum number of service threads was reached but users experienced no errors, revisit your load planning, because this is a symptom of a web instance reaching its maximum service capacity. If users experienced errors during the monitored interval, look in the Tomcat logs for the conditions that caused the number of processing threads to reach its limit.

Figure: Number of processing threads created

acceptCount

View the acceptCount value as a buffer for smoothing out surges in incoming HTTP requests. Spillover requests, those that arrive when no service thread is available, are placed in this queue. A queued request takes longer to serve, because the service time the user experiences is the actual service time spent by the processing thread plus the time the request spent in the queue.

For service-time critical deployments, set this queue to a smaller value (such as 100). For a highly fluctuating web load, set this queue to a higher value (such as 250).
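
The following illustrative snippets contrast the two profiles; the other attribute values are placeholders:

<!-- Service-time-critical deployment: keep the queue short so queued
     requests do not accumulate long wait times -->
<Connector port="80" protocol="HTTP/1.1" URIEncoding="UTF-8"
    maxThreads="600" acceptCount="100"/>

<!-- Highly fluctuating load: allow a deeper queue to absorb surges -->
<Connector port="80" protocol="HTTP/1.1" URIEncoding="UTF-8"
    maxThreads="600" acceptCount="250"/>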

Other connector properties

Other Tomcat connector thread properties affect the performance of the web application to a lesser degree. These properties include acceptorThreadCount and acceptorThreadPriority.

The mid tier uses UTF-8 encoding for requests and responses, so include this encoding in the connector configuration (URIEncoding="UTF-8"). This is especially important for deployments that use multiple languages.

Tomcat thread configuration example

In the following example, the web stack instance is designed to support a maximum of 300 concurrent users, with the HTTP request buffer queue set to 100 and UTF-8 encoding enabled. The maxThreads parameter is set to twice the maximum number of concurrent users (600).

<Connector URIEncoding="UTF-8" acceptCount="100" connectionTimeout="60000"
    maxHttpHeaderSize="8192" maxKeepAliveRequests="5000" maxThreads="600"
    port="80" protocol="HTTP/1.1" redirectPort="8443"/>
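
This Connector element typically resides in Tomcat's conf/server.xml file, and a Tomcat restart is required for changes to take effect.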

 
