Troubleshooting performance latency issues


Latency in BMC HelixGPT's performance might occur due to the following reasons:

  • Changes in parameter values in the default skill configuration.
  • A proxy or gateway is configured between BMC HelixGPT and the LLM provider.
  • Changes in the LLM model or client configuration parameters.

Issue symptoms

End users experience latency in responses from BMC HelixGPT. Sometimes, BMC HelixGPT takes more than a minute to generate a response.

Issue scope

This issue can occur in the following scenarios:

  • Change in the default skill configuration parameters.
  • A proxy or gateway is configured between BMC HelixGPT and the LLM provider.
  • Lack of optimization of client and LLM model configuration parameters for specific requirements.

Resolution

  • For agentic skills, we recommend setting the numberOfDocumentsToReturn parameter to 5 or less.
    Increasing the default value can degrade performance. For more information, see Updating the configuration parameters of a skill.
  • If a proxy or gateway is configured between BMC HelixGPT and the LLM provider, make sure to allocate sufficient memory and CPU to the proxy server, optimize its configuration, and enable resource usage monitoring.
  • Optimize the client and LLM model configuration parameters for your specific requirements. To optimize the model configurations, based on your environment requirements, perform any or all of the following steps:

    Add custom headers

    This configuration can be added to all LLM models, such as Azure, OpenAI, Llama, and Gemini, to improve streaming performance when a proxy or gateway is configured between BMC HelixGPT and the LLM model.

    Add the following configurations to the model's default configurations. For more information about updating skill configurations, see Updating the configuration parameters of a skill.


    "customHeaders": [
     {"name": "X-Accel-Buffering", "value": "no"},
     {"name": "Cache-Control", "value": "no-cache, no-store"},
     {"name": "Connection", "value": "keep-alive"}
    ]
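    To illustrate how these headers behave on the wire, the following Python sketch (hypothetical, not BMC HelixGPT code) translates the customHeaders array into the header dictionary an HTTP client would attach to each request to the LLM provider:

```python
import json

# Hypothetical sketch: translating the customHeaders array from the model
# configuration into HTTP request headers (illustrative only).
custom_headers = json.loads("""
[
 {"name": "X-Accel-Buffering", "value": "no"},
 {"name": "Cache-Control", "value": "no-cache, no-store"},
 {"name": "Connection", "value": "keep-alive"}
]
""")

# Build the header dictionary a client would send with each LLM request.
headers = {h["name"]: h["value"] for h in custom_headers}
print(headers)
```

    X-Accel-Buffering: no tells buffering proxies such as NGINX to pass the response through immediately rather than buffering it, which is what keeps token-by-token streaming responsive.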

    Enable custom HTTP client

    This configuration can be added only to Azure OpenAI models to improve network stability and performance by optimizing timeouts, connection pooling, and persistent (keep-alive) connections.

    Add the following configurations to the model's default configurations:


    "httpxClientEnabled": true

    The following default settings are configured in a model when the custom HTTP client is enabled. You can customize these settings to suit your requirements; however, we recommend that you do not change the default configurations.


    "httpxClientEnabled": true
    "httpxClientConfig": {
    "readTimeout": 300.0,
    "connectTimeout": 120.0,
    "writeTimeout": 120.0,
    "poolTimeout": 120.0,
    "maxConnections": 100,
    "maxKeepaliveConnections": 20,
    "keepaliveExpiry": 30.0,
    "http2Enabled": false,
    "followRedirects": true
    }

    Increase HTTP Connection Pool Max Size

    The default value for this configuration is 10. For high-concurrency environments, we recommend increasing the value to 50.

    Add the following configuration to your .env file and restart the application for the updates to take effect.

      #.env file
      HTTP_CONNECTION_POOL_MAX_SIZE=50 
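    The application reads this variable at startup. A minimal illustrative sketch of that lookup (the variable name comes from this article; the reading logic itself is hypothetical):

```python
import os

# Hypothetical sketch: resolve the pool size from the environment, falling
# back to the documented default of 10 when the variable is not set.
pool_max_size = int(os.environ.get("HTTP_CONNECTION_POOL_MAX_SIZE", "10"))
print(pool_max_size)
```

    Because the value is read once at startup, the restart mentioned above is required before the new pool size takes effect.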

 


BMC HelixGPT 26.1