Troubleshooting
This section helps you identify and resolve potential issues.
| Issue | Solution |
|---|---|
| No response is returned when the code exceeds 400 lines. | The engine was built without the max_num_tokens parameter, which defaults to a lower value. Rebuild the engine with the parameter added: `trtllm-build --checkpoint_dir ${UNIFIED_CKPT_PATH} --gemm_plugin float16 --output_dir ${ENGINE_PATH} --max_num_tokens 32768` Here, 32768 is the context length for Mixtral, used when Mixtral is deployed on the Triton server through the BYOLLM feature. |
| The load-embeddings job is in a failed state. To check the job status, run `kubectl get jobs -n bmcami-prod-amiai-services` | Run the following commands to restart the job. First: `kubectl delete job load-embeddings -n bmcami-prod-amiai-services` Then: `helm upgrade amiai-services /<extracted_dir>/BMC-AMI-PLATFORM-2.0.00/helm_charts/07-helm-amiai-chart/ --namespace bmcami-prod-amiai-services --reuse-values` |
| The following error appears in BMC AMI Chat Assistant: The Assistant service is currently unavailable. Try again. If the problem persists, contact your BMC AMI Platform administrator. | Check the health of the AI services. |
| The following error message appears on the Uptrace UI during login: READONLY You can't write against a read only replica. | |
| The CES instance isn't launching from BMC AMI Platform. | Make sure that your CES host is running and is accessible via HTTPS. Verify the host connectivity by clicking Test connection. |
| You can't add a CES instance. | |
| Authentication fails when you're adding a CES instance. | Confirm that the CES credentials are correct. For CES versions earlier than 23.04.06, you must enter credentials each time. |
| The CES version is not displayed during the setup. | Make sure that the CES instance is running and accessible. The version number appears only after successful authentication. |
| The CES instance is displayed as unavailable. | Click Test connection to check the host status. If required, restart the CES host. |
| An HTTPS requirement error has occurred. | CES must use HTTPS for integration with BMC AMI Platform. Update the CES configuration to enable HTTPS. |
| Credentials are not saved for future access. | Upgrade to CES version 23.04.06 or 24.05.01 to enable credential storage in the BMC AMI Platform database. BMC AMI Platform natively supports CES versions 23.04.06 and later modification levels within the 23.04 release, as well as 24.05.01 and later. When you add a CES instance by using these versions, the credentials you enter are securely stored in the database and automatically reused for future access. BMC AMI Platform also supports CES version 20.15.03 or later, but you must enter your credentials each time you access CES from BMC AMI Platform. |
| The deployment completed, but the container images could not be pulled because they do not exist in the repository. | Remove the deployment by using the following teardown script, and then run the deployment again. |

```shell
#!/bin/bash
# Kubernetes namespace cleanup script
# This script cleans up Helm releases and all resources in specified namespaces

# Define namespaces
namespaces="bmcami-prod-user-management bmcami-prod-amiai-services bmcami-prod-data-service bmcami-prod-observability"

echo "===== Deleting Helm releases (if any) ====="
for ns in $namespaces; do
  echo "Namespace: $ns"
  releases=$(helm list -n "$ns" --short)
  if [ -n "$releases" ]; then
    echo "$releases" | xargs -r -I{} helm uninstall {} -n "$ns"
  else
    echo "No Helm releases found in namespace $ns"
  fi
done

echo "===== Deleting all resources in those namespaces ====="
for ns in $namespaces; do
  echo "Cleaning namespace: $ns"
  kubectl delete all --all -n "$ns" --ignore-not-found=true
  kubectl delete pvc --all -n "$ns" --ignore-not-found=true
  kubectl delete secret --all -n "$ns" --ignore-not-found=true
  kubectl delete configmap --all -n "$ns" --ignore-not-found=true
  kubectl delete rolebinding,role,serviceaccount,networkpolicy,ingress -n "$ns" --all --ignore-not-found=true
done

echo "===== Deleting the namespaces ====="
kubectl get namespace --no-headers -o custom-columns=:metadata.name | grep -E "bmcami-prod-(user-management|amiai-services|data-service|observability)" | xargs -r kubectl delete namespace

echo "===== Deleting related PVs ====="
kubectl get pv --no-headers -o custom-columns=:metadata.name | grep -E "bmcami-prod-(user-management|amiai-services|data-service|observability)" | xargs -r kubectl delete pv

echo "===== Cleaning up NFS storage ====="
if [ -d "/mnt/nfs" ]; then
  echo "Deleting all contents in /mnt/nfs/"
  sudo rm -rf /mnt/nfs/*
  if [ $? -eq 0 ]; then
    echo "NFS storage cleaned successfully"
  else
    echo "Failed to clean NFS storage"
  fi
else
  echo "NFS directory /mnt/nfs/ does not exist"
fi

echo "===== Verifying cleanup ====="
for ns in $namespaces; do
  echo "Checking $ns..."
  if kubectl get namespace "$ns" &>/dev/null; then
    echo "Namespace $ns still exists"
    kubectl get all -n "$ns" 2>/dev/null || echo "No resources found in namespace $ns"
  else
    echo "Namespace $ns fully removed."
  fi
done

remaining_pvs=$(kubectl get pv --no-headers -o custom-columns=:metadata.name 2>/dev/null | grep -E "bmcami-prod-(user-management|amiai-services|data-service|observability)" || true)
if [ -n "$remaining_pvs" ]; then
  echo "Remaining PVs:"
  echo "$remaining_pvs"
else
  echo "No PVs remaining."
fi

echo "===== Cleanup completed ====="
rm -rf /mnt/nfs
```
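
Before you tear down a deployment because of missing container images, it can help to confirm that the pods are actually stuck on image pulls. The following is a minimal sketch, assuming kubectl is configured for the cluster. The function name list_image_pull_failures is hypothetical and not part of the product; ImagePullBackOff and ErrImagePull are the standard Kubernetes status values for this condition.

```shell
# Print the names of pods in a namespace whose STATUS column reports an
# image pull failure. The function name is illustrative, not part of the product.
list_image_pull_failures() {
    # $1: namespace to inspect
    kubectl get pods -n "$1" --no-headers 2>/dev/null |
        awk '$3 ~ /(ImagePullBackOff|ErrImagePull)$/ { print $1 }'
}

# Example usage over the product namespaces:
# for ns in bmcami-prod-user-management bmcami-prod-amiai-services \
#           bmcami-prod-data-service bmcami-prod-observability; do
#     list_image_pull_failures "$ns"
# done
```

If the function prints nothing for every namespace, the pods are failing for a different reason and a teardown might not help.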
BMC AMI AI knowledge Hub—Common errors
This section helps you identify and resolve common errors in the BMC AMI AI knowledge hub service.
You can use the error code in the product message to navigate directly to the matching troubleshooting entry.
How to find the asset ID
The system often requires the asset ID when you report an error or when an administrator searches logs.
- From API responses—Open the developer tools in the UI, navigate to the Network tab, and locate the upload, asset list, or asset details API. The response body includes asset_id.
- From logs—The system might log the asset ID when an error occurs. Search the logs by file name or by the approximate upload time to locate the relevant entries.
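
Both lookups can be sketched from a terminal. This is an illustration only: it assumes that the response body has been saved or piped as JSON, that asset_id is serialized as a JSON string, and that the logs are available as a plain text file. The function names extract_asset_id and find_log_entries are hypothetical.

```shell
# Print the first asset_id value found in a JSON response body read from stdin.
# Assumes asset_id is a JSON string; adjust the pattern if it is numeric.
extract_asset_id() {
    grep -o '"asset_id"[[:space:]]*:[[:space:]]*"[^"]*"' |
        head -n 1 |
        sed 's/.*:[[:space:]]*"\([^"]*\)"/\1/'
}

# Print log lines (with line numbers) that mention a file name.
# $1: file name to search for, $2: path to the log file
find_log_entries() {
    grep -n -F -- "$1" "$2"
}
```

For example, `extract_asset_id < response.json` prints the ID from a saved response, and `find_log_entries report.pdf service.log` lists the matching log lines; both file names are placeholders.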
Quick lookup
| Error code | Issue | Area |
|---|---|---|
| AAPKNW001E | Unsupported file type | Upload |
| AAPKNW002E | File too large | Upload |
| AAPKNW003E | Asset deleted | Publish |
| AAPKNW007E | File no longer exists | Publish |
| AAPKNW009E | Asset not found | General |
| AAPKNW011E | Operation in progress | General |
| AAPKNW021E | Service unavailable | Platform |
| AAPKNW025E | Empty document | Publish |
| AAPKNW026E | Encrypted document | Publish |
| AAPKNW027E | Database update failed | Publish / Unpublish |
| AAPKNW028E | No valid text | Publish |
| AAPKNW029E | Upload failed | Upload |
| AAPKNW030E | Embedding service issue | Embedding |
| AAPKNW031E | Embedding service issue | Embedding |
| AAPKNW032E | OCR service issue | OCR |
| AAPKNW033E | OCR service issue | OCR |
| AAPKNW034E | Link expired | Download link |
| AAPKNW035E | Empty file | Upload |
| AAPKNW036E | Unrecognized file type | Upload |
| AAPKNW037E | Unexpected publish error | Publish |
| AAPKNW038E | Document not supported | Publish |
| AAPKNW040E | Unpublish failed | Unpublish |
| AAPKNW041E | Delete failed | Delete |
| AAPKNW042E | Cancel publish failed | Cancel |
| AAPKNW043E | AI visibility update failed | AI visibility |
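
To find which of these codes appear in a message or a log excerpt, a small filter can help. This is a minimal sketch; the function name extract_knw_codes is hypothetical, and the pattern assumes that every code follows the AAPKNW, three digits, E format shown in the table.

```shell
# Read text from stdin and print each distinct knowledge hub error code once.
extract_knw_codes() {
    grep -oE 'AAPKNW[0-9]{3}E' | sort -u
}
```

For example, `extract_knw_codes < service.log` lists the codes to look up in the table, where service.log is a placeholder file name.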
BMC AMI AI OCR service—Troubleshooting
This topic helps you identify and resolve issues in the BMC AMI AI OCR service.
You can use the error code in the API response to locate the matching troubleshooting entry.
Error codes quick lookup
| Error code | Message | HTTP status |
|---|---|---|
| AAPOCR001E | The OCR operation failed. | 500 |
| AAPOCR002E | The input file was not found. | 404 |
Troubleshooting entries
| Error code | Issue | End user | Administrator |
|---|---|---|---|
| AAPOCR002E | The input file was not found | Verify that the file name matches the uploaded file and includes the .pdf extension. | Verify that the file exists in the input directory. Check the DATA_DIR_PATH and volume mounts. |
| AAPOCR001E | The OCR operation failed | Try a different file or retry the request. | Check service logs using the file name and timestamp. Review the stderr output for OCR errors. |
| HTTP 504 | The request timed out | Try a smaller or shorter document. | Review OCR_SERVICE_TIMEOUT_S and increase it if needed. Check the logs for OCR Process Timed Out. |
| HTTP 422 | The request is invalid | Verify that the request includes a valid file_name and a boolean force_ocr value. | Verify API input validation and request payload structure. |
| HTTP 200 (ocr: false) | OCR was skipped because text exists | No action is required unless OCR is needed. To run OCR anyway, set force_ocr to true. | This behavior is expected. |
| HTTP 200 (backend failure) | The OCR backend failed | Contact your administrator. | Check logs for exit code 7 and stderr output. Investigate OCR engine errors. |
Validation messages
| Field | Message |
|---|---|
| file_name | The file name is required. |
| force_ocr | The force_ocr value must be a boolean. |
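
The two validation rules can also be checked on the client before a request is sent. The following is a minimal sketch, assuming that the service accepts a JSON body with the file_name and force_ocr fields. The function name build_ocr_payload is hypothetical, and the endpoint URL is not shown because it is not specified here.

```shell
# Build a JSON request body after applying the two validation rules above.
# $1: file_name, $2: force_ocr (must be the literal true or false)
# Note: the file name is inserted as-is; quotes or backslashes are not escaped.
build_ocr_payload() {
    if [ -z "$1" ]; then
        echo "The file name is required." >&2
        return 1
    fi
    case "$2" in
        true|false) ;;
        *)
            echo "The force_ocr value must be a boolean." >&2
            return 1
            ;;
    esac
    printf '{"file_name": "%s", "force_ocr": %s}\n' "$1" "$2"
}
```

For example, `build_ocr_payload report.pdf true` prints a body that sets force_ocr to true, and a value such as yes is rejected with the same message that the service returns.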
Log-based troubleshooting
| Log message | Action |
|---|---|
| OCR Process Timed Out | Increase OCR_SERVICE_TIMEOUT_S, and review the document size and system resources. |
| exit code 6 | The file already contains text. No action is required. |
| exit code 7 | Review stderr output and investigate OCR engine errors. |
| exit code 2 (DigitalSignatureError) | The PDF contains digital signatures. Review stderr output and retry with a supported file. |
| exit code N | Review stderr output and OCR engine documentation. |
| File not found | Verify the DATA_DIR_PATH value, the file location, and the file permissions. |
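
The exit code rows above can be turned into a small lookup for use while reading logs. This is a sketch only; the function name describe_ocr_exit_code is hypothetical, and the messages restate the table rather than any output of the service.

```shell
# Print the suggested action for an OCR exit code found in the logs.
# $1: numeric exit code
describe_ocr_exit_code() {
    case "$1" in
        2) echo "DigitalSignatureError: the PDF contains digital signatures. Review stderr and retry with a supported file." ;;
        6) echo "The file already contains text. No action is required." ;;
        7) echo "Review stderr output and investigate OCR engine errors." ;;
        *) echo "Review stderr output and the OCR engine documentation." ;;
    esac
}
```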
Resource and performance issues
| Issue | Action |
|---|---|
| The OCR service runs out of memory | Increase the memory limit in the deployment configuration or use smaller PDF files. |
Administrator checklist
| Check | How to verify |
|---|---|
| The service is running | Call GET /health and verify that it returns 200. |
| The log file is available | Verify that {LOG_DIR_PATH}/{LOG_FILE_NAME}.log exists. |
| The input directory exists | Verify the DATA_DIR_PATH value and list the files in the directory. |
| The output directory is writable | Verify the OCR_WRITE_PATH permissions. |
| The timeout configuration is correct | Review the OCR_SERVICE_TIMEOUT_S configuration. |
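
Part of this checklist can be scripted. The following is a minimal sketch, assuming that the DATA_DIR_PATH, OCR_WRITE_PATH, LOG_DIR_PATH, and LOG_FILE_NAME variables are set in the shell where it runs, and that the service base URL is known. The function names and the base URL argument are hypothetical.

```shell
# Check the input directory, output directory, and log file from the checklist.
# Returns 0 when all three checks pass, 1 otherwise.
check_ocr_paths() {
    ocr_check_status=0
    if [ ! -d "$DATA_DIR_PATH" ]; then
        echo "Input directory is missing: $DATA_DIR_PATH"
        ocr_check_status=1
    fi
    if [ ! -d "$OCR_WRITE_PATH" ] || [ ! -w "$OCR_WRITE_PATH" ]; then
        echo "Output directory is not writable: $OCR_WRITE_PATH"
        ocr_check_status=1
    fi
    if [ ! -f "$LOG_DIR_PATH/$LOG_FILE_NAME.log" ]; then
        echo "Log file is missing: $LOG_DIR_PATH/$LOG_FILE_NAME.log"
        ocr_check_status=1
    fi
    return "$ocr_check_status"
}

# Succeed only when GET /health returns HTTP 200.
# $1: base URL of the OCR service (an assumption, for example http://host:port)
check_ocr_health() {
    ocr_http_code=$(curl -s -o /dev/null -w '%{http_code}' "$1/health")
    [ "$ocr_http_code" = "200" ]
}
```

The timeout setting is left as a manual review because the right value depends on document size and available resources.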
Quick log search
| Issue | Search for |
|---|---|
| Timeout | OCR Process Timed Out |
| File not found | not found |
| OCR failure | AAPOCR001E |
| Backend failure | exit code 7 |
| Already processed | exit code 6 |
| Error details | Stderr: |
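
The searches in this table can be wrapped in one helper. This is a sketch under the assumption that the service log is a plain text file; the function name search_ocr_log and its issue keywords are hypothetical, and the search strings are the ones listed in the table.

```shell
# Search an OCR log file for the string that matches a known issue.
# $1: issue keyword, $2: path to the log file
search_ocr_log() {
    case "$1" in
        timeout)    ocr_pattern='OCR Process Timed Out' ;;
        not-found)  ocr_pattern='not found' ;;
        failure)    ocr_pattern='AAPOCR001E' ;;
        backend)    ocr_pattern='exit code 7' ;;
        processed)  ocr_pattern='exit code 6' ;;
        details)    ocr_pattern='Stderr:' ;;
        *)
            echo "Unknown issue keyword: $1" >&2
            return 2
            ;;
    esac
    # -F treats the pattern as a fixed string; -n prefixes each match with its line number.
    grep -n -F -- "$ocr_pattern" "$2"
}
```

For example, `search_ocr_log timeout "$LOG_DIR_PATH/$LOG_FILE_NAME.log"` lists the timeout entries with their line numbers.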
Support
If the issue persists, collect the following information:
- The error code and HTTP status
- The file name and timestamp
- Relevant log entries
- Environment details
Contact your BMC AMI Platform administrator or support team with this information.