LLM library


This topic describes how to use the LLM library to install, configure, and manage supported large language models (LLMs) directly from the user interface (UI). Using the LLM library eliminates the need for manual setup through command-line tools.

The key features of the LLM library are as follows:

  • Installation—Easily install preconfigured LLMs.
  • Model selection—Select from various LLMs supported by BMC AMI AI Services.
  • Version control and updates—View available versions and update models with ease.

The LLM library has the following benefits:

  • Faster deployment—Get started with LLMs without technical hurdles.
  • User-friendly experience—Intuitive UI simplifies model installation and management.
  • Optimized performance—Configure models for best efficiency on your hardware.

The LLM library displays cards with the following details about the available LLMs:

  • Runs on—Shows whether the LLM runs on a GPU or a CPU. LLMs can run on both CPU and GPU, but the LLM library does not support CPU images.
  • Deployed on—Shows the host address where the LLM deployment is triggered. The host address is displayed if the LLM has one of the following statuses:
      • Deployed
      • Deploying
      • Error
  • Deployed—Shows whether the LLM was deployed manually or automatically through the LLM library.
  • Status—Shows one of the following LLM statuses:
      • New—LLM is new and unused.
      • Deploying—LLM deployment is in progress.
      • Deployed—LLM has been successfully deployed.
      • Inactive—LLM is not being used.
      • Error—An error occurred during LLM deployment.
      • Update required—LLM has been revised in the latest version and must be updated before you use it.
  • Integrations—Shows the supported integrations.
  • Action button—Shows the action that is available for the LLM, depending on its status.

The action button at the bottom of the card depends on the LLM status. The following actions are available:

  • Deploy—Available when the LLM is in New or Revoked status. For more information, see Deploying LLMs through the LLM library.
  • Update—Available when the LLM is in Error status or requires an update. For more information, see Updating LLMs.
  • Revoke—Available when the LLM has been deployed and needs to be revoked. For more information, see Revoking LLMs.
  • Delete—Available when the LLM has been deployed manually and needs to be removed. For more information, see Deleting LLMs.

Deploying LLMs through the LLM library

Before deploying an LLM, you must configure kubectl access and the environment for the RKE2 Kubernetes cluster. For more information, see Configure kubectl access and environment for RKE2 Kubernetes cluster.
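
If you want to confirm the configuration before you start a deployment, you can check cluster access from the machine where kubectl is configured. The following commands are only a sketch; they assume the default RKE2 kubeconfig path, which might differ in your environment:

  # Point kubectl at the RKE2 kubeconfig (default path; adjust if yours differs)
  export KUBECONFIG=/etc/rancher/rke2/rke2.yaml

  # Confirm that the cluster nodes are reachable and in Ready state
  kubectl get nodes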

To deploy an LLM by using automated deployment

  1. Sign in to BMC AMI Platform using your credentials. 
  2. From the menu in the left pane, click BMC AMI AI Manager > LLM library.
  3. In the LLM card, click Deploy. The Deployment details dialog box is displayed.


  4. In the Machine information section, select the machine name.
  5. In the LLM engine settings section, follow these steps:
    1. In the Number of GPUs field, enter the number of GPUs. To find the maximum number of available GPUs, run the following command on the GPU node:
      nvidia-smi --query-gpu=name --format=csv,noheader | wc -l
    2. In the CPU KVCache space field, specify the memory (in GB) allocated for the LLM KV cache. Higher values allow more parallel requests. Set the value according to the following table (an example calculation for an on-premises machine follows this procedure):

      System configuration        Default value
      g5.12xlarge                 186
      Standard_NC24ads_A100_v4    210
      On-premises machine         Set the value according to your system memory. For example, if the available memory is 64 GB, the value is 44 (64 - 20).

  6. Verify all the details that you have entered, and then click Deploy.
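
If you are deploying on an on-premises machine, the following sketch shows one way to estimate the CPU KVCache space value from the host's memory. It simply restates the example formula above (available memory in GB minus a 20 GB reserve); adjust the reserve to match your system:

  # Read the available system memory in GB and subtract a 20 GB reserve,
  # as in the example above.
  AVAILABLE_GB=$(free -g | awk '/^Mem:/ {print $7}')
  echo "Suggested CPU KVCache space: $((AVAILABLE_GB - 20))"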

The deployment process starts in the background. The total time for deployment depends on the network bandwidth and other factors.

After a successful deployment, the LLM appears on the LLM Settings page. To use it in an integration, navigate to Integration Settings and update the integration to include the LLM. While deployment is in progress, the LLM card displays the Deploying status, and no other LLM deployment is allowed.

Updating LLMs

If a deployed LLM fails or requires an update, you can update it via the LLM library. You can update an LLM only if it was deployed through the library's automated deployment.

  1. In the LLM card in the LLM library, click Update.
  2. Follow the steps in Deploying LLMs through the LLM library.
  3. Verify all the details that you have entered and click Deploy.

The deployment process starts in the background. The total time for deployment depends on the network bandwidth and other factors. 

After a successful deployment, the LLM appears on the LLM Settings page. To use it in an integration, navigate to Integration Settings and update the integration to include the LLM. 

While deployment is in progress, the LLM card displays the Deploying status, and no other LLM deployment is allowed. 

Revoking LLMs

If an LLM that was deployed by automated deployment via the LLM library is no longer required, you can revoke it via the LLM library.

The revocation process reverts changes made during deployment, including the removal of the container from the host. The model files downloaded during deployment remain unchanged.   

After revocation, the LLM is no longer visible or available for use in any integration, and features that used it might not work as expected.

To revoke an LLM, in the LLM card in the LLM library, click Revoke. Read the warning message, and then click Revoke to confirm and proceed.
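
If you want to confirm that the container was removed from the host after revocation, you can list the workloads on the cluster. The following command is only a generic check; the llm name filter is an assumption and depends on how the LLM workload is named in your environment:

  # List pods in all namespaces and filter by name; "llm" is only an assumed
  # naming pattern and might differ in your environment.
  kubectl get pods --all-namespaces | grep -i llm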

Deleting LLMs

If an LLM was deployed manually, you can delete it instead of revoking it.

Important

Before deleting an LLM, you must undeploy the manually deployed LLM.

To delete an LLM, click Delete in the LLM card. A warning appears, stating that deleting the LLM will cause any integrations using it to stop working. Read the warning message and click Delete.

When the deletion process ends, the LLM is no longer visible or available for use in any integration. 

 
