LLM library
This topic describes how to use the LLM library to install, configure, and manage supported large language models (LLMs) directly from the user interface (UI). Using the LLM library eliminates the need for manual setup through command-line tools.
The key features of the LLM library are as follows:
- Installation—Easily install preconfigured LLMs.
- Model selection—Select from various LLMs supported by BMC AMI AI Services.
- Version control and updates—View available versions and update models with ease.
The LLM library has the following benefits:
- Faster deployment—Get started with LLMs without technical hurdles.
- User-friendly experience—Intuitive UI simplifies model installation and management.
- Optimized performance—Configure models for best efficiency on your hardware.
The LLM library displays cards with the following details about the available LLMs:
| Field | Description |
|---|---|
| Runs on | Shows whether the LLM runs on GPU or CPU. An LLM can run on either CPU or GPU, but CPU images are not supported by the LLM library. |
| Deployed on | Shows the host address where the LLM deployment is triggered. If the LLM has one of the following statuses: |
| Deployed | Shows whether the LLM was deployed manually or through the LLM library. You can deploy the LLM automatically or manually through the UI. |
| Status | Shows one of the following LLM statuses: |
| Integrations | Shows the supported integrations. |
| Action button | The action button at the bottom of the card depends on the LLM status. For more information about the actions, follow the link in the Reference column. |
Deploying LLMs through the LLM library
Before deploying an LLM, you must configure kubectl access and the environment for the RKE2 Kubernetes cluster. For more information, see Configure kubectl access and environment for RKE2 Kubernetes cluster.
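The linked topic is the authoritative procedure. As a quick illustration, on a typical RKE2 server node the kubeconfig is written to /etc/rancher/rke2/rke2.yaml and the bundled kubectl is installed under /var/lib/rancher/rke2/bin, so a minimal verification, assuming a default RKE2 installation, might look like this:

```bash
# Point kubectl at the kubeconfig that RKE2 writes on the server node
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml

# RKE2 ships its own kubectl; add it to PATH if kubectl is not already installed
export PATH=$PATH:/var/lib/rancher/rke2/bin

# Confirm that the cluster is reachable before triggering a deployment
kubectl get nodes -o wide
```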
To deploy an LLM by using automated deployment
- Sign in to BMC AMI Platform using your credentials.
- From the menu in the left pane, click BMC AMI AI Manager > LLM library.
- In the LLM card, click Deploy. The Deployment details dialog box is displayed.

- In the Machine information section, select the machine name.
- In the LLM engine settings section, follow these steps:
- In the Number of GPUs field, enter the number of GPUs. To find the maximum number of available GPUs, run the following command on the GPU node: `nvidia-smi --query-gpu=name --format=csv,noheader | wc -l`
- In the CPU KVCache space field, specify the memory allocated for the LLM KV cache. Higher values allow more parallel requests. You must set the value according to the following table (for on-premises machines, see the sketch after this procedure):

  | System configuration | Default value (GB) |
  |---|---|
  | g5.12xlarge | 186 |
  | Standard_NC24ads_A100_v4 | 210 |
  | On-premises machine | Set the value according to your system memory. For example, if the available memory is 64 GB, the value would be 44 (64 - 20). |
- Verify all the details that you have entered and click Deploy.
The deployment process starts in the background. The total time for deployment depends on the network bandwidth and other factors.
After a successful deployment, the LLM appears on the LLM Settings page. To use it in an integration, navigate to Integration Settings and update the integration to include the LLM.
While deployment is in progress, the LLM card displays the Deploying status, and no other LLM deployment is allowed.
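For on-premises machines, the CPU KVCache space value above is derived from the total system memory minus roughly 20 GB, as in the 64 GB example. The following sketch, which assumes a Linux host and that the 20 GB reserve suits your workload, computes a starting value:

```bash
# Total system memory in GB, rounded down
total_gb=$(free -g | awk '/^Mem:/ {print $2}')

# Reserve about 20 GB for the OS and other processes,
# matching the on-premises example (64 GB total -> 44)
kvcache_gb=$((total_gb - 20))

echo "Suggested CPU KVCache space: ${kvcache_gb}"
```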
Updating LLMs
If a deployed LLM fails or requires an update, you can update it from the LLM library. You can update an LLM only if it was deployed through the LLM library by using automated deployment.
- In the LLM card in the LLM library, click Update.
- Follow the steps in the Deploying LLMs through the LLM library section.
- Verify all the details that you have entered and click Deploy.
The deployment process starts in the background. The total time for deployment depends on the network bandwidth and other factors.
After a successful deployment, the LLM appears on the LLM Settings page. To use it in an integration, navigate to Integration Settings and update the integration to include the LLM.
While deployment is in progress, the LLM card displays the Deploying status, and no other LLM deployment is allowed.
Revoking LLMs
If an LLM that was deployed through the LLM library by using automated deployment is no longer required, you can revoke it from the LLM library.
The revocation process reverts changes made during deployment, including the removal of the container from the host. The model files downloaded during deployment remain unchanged.
To revoke an LLM, in the LLM card in the LLM library, click Revoke. Read the warning message and click Revoke to confirm and proceed. When the revocation process is complete, the LLM is no longer visible or available for use in any integration, and any features that depended on it might not work as expected.
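If you want to confirm that revocation removed the LLM workload from the host, you can check the cluster for the corresponding pod. This is only a hypothetical check; it assumes the LLM was running as a pod in the RKE2 cluster, and `<llm-name>` is a placeholder for the name shown on the LLM card:

```bash
# List all pods and filter for the revoked LLM; no output means the
# container has been removed from the host
kubectl get pods --all-namespaces | grep -i "<llm-name>"
```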
Deleting LLMs
If an LLM was deployed manually, you can delete it instead of revoking it.
To delete an LLM, click Delete in the LLM card. A warning states that deleting the LLM will cause any integrations that use it to stop working. Read the warning message and click Delete to confirm.
When the deletion process ends, the LLM is no longer visible or available for use in any integration.