Deployed large language models (LLMs)
This topic describes how to view the deployed large language models (LLMs) and their details in one place. You can also add a local LLM.
To view the deployed LLMs
- Sign in to BMC AMI Platform by using your credentials.
- Click the Platform manager tab.
- From the menu in the left pane, select BMC AMI AI Manager > AI services settings > Deployed LLMs.
The Deployed large language model (LLM) window lists all the deployed LLMs.
To add a local model
- On the upper left of the LLM settings window, click Add local model.
- In the Add local large language model dialog box, select the Public API or the Self-hosted tab, depending on your model type.
For the Public API tab, follow these steps:
- Select the hosted platform.
- Enter an API access key.
- Click Test connection.
- If a success confirmation message is displayed, click Next>>. If a failure message is displayed, verify that the API access key you provided is valid; you can also test the key outside the product, as in the sketch after these steps.
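If the connection test keeps failing, it can help to confirm the key outside the product. The following is a minimal Python sketch, assuming an OpenAI-style hosted platform whose models endpoint is https://api.openai.com/v1/models and an API key stored in an OPENAI_API_KEY environment variable; both are examples, so substitute the endpoint and key location that your hosted platform uses.

import json
import os
import urllib.request

# Example values: adjust the endpoint and key source for your hosted platform.
API_KEY = os.environ.get("OPENAI_API_KEY", "")
MODELS_URL = "https://api.openai.com/v1/models"

request = urllib.request.Request(
    MODELS_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
)

# A 200 response with a model list indicates that the key is accepted;
# a 401 or 403 error usually means the key is wrong, expired, or lacks access.
with urllib.request.urlopen(request) as response:
    payload = json.loads(response.read())
    for model in payload.get("data", []):
        print(model.get("id"))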
For the Self-hosted tab, follow these steps:
- Complete the following fields:
Inference engine: Select the inference engine that you want to use. vLLM, Triton, and OpenAI-compatible inference servers are supported.
Base URL: Enter the base URL of your inference engine. For example:
- Triton: http://my-triton-server:4000/v1
- OpenAI compatible: https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/
API access key: (Optional) Key to access the LLM server API, if one is configured on the LLM server.
- Click Test connection.
- If a success confirmation message is displayed, click Next>>. If a failure message is displayed, verify that the details you provided are valid; you can also check the base URL directly, as in the sketch after these steps.
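If the connection test fails for a self-hosted engine, you can check the base URL directly. The following is a minimal Python sketch, assuming an OpenAI-compatible inference server (such as vLLM) that lists its models at <base URL>/models; the base URL and the API access key are placeholders for your own values.

import json
import urllib.request

# Placeholder values: replace with the details you entered in the dialog box.
BASE_URL = "http://my-triton-server:4000/v1"   # base URL of your inference engine
API_ACCESS_KEY = ""                            # leave empty if no key is configured

headers = {}
if API_ACCESS_KEY:
    headers["Authorization"] = f"Bearer {API_ACCESS_KEY}"

# Most OpenAI-compatible servers expose GET <base URL>/models.
request = urllib.request.Request(BASE_URL.rstrip("/") + "/models", headers=headers)
with urllib.request.urlopen(request) as response:
    payload = json.loads(response.read())
    for model in payload.get("data", []):
        print(model.get("id"))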
- On the Step 2: Add details and save tab, complete the following fields with the details of the on-premises LLM server:
Model: Select a model from the Model details list.
Version: Version of the LLM.
Display name: (Optional) Name that identifies the model in the BMC AMI AI Management Console.
Description: Description of the LLM.
Supported integrations: (Optional) Select the supported integration.
Include this LLM in the BMC AMI Assistant chat LLM list: Enable this option if you want to use the selected LLM in the BMC AMI Assistant chat list.
Maximum number of tokens: Non-zero positive integer representing the maximum number of tokens configured for the model on the on-premises LLM server. A minimum of 100,000 tokens is required for BMC AMI Assistant chat.
- Click Save.
A success confirmation message is displayed. If the procedure fails, the user interface displays an error message with the reason for the failure. After you have added the LLM, see Integrations settings and BMC AMI Assistant chat settings to start using it.
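Before you enable those integrations, you might want to confirm that the newly added model answers requests. The following is a minimal Python sketch, assuming a self-hosted, OpenAI-compatible inference server; the base URL, model name, and API access key are placeholders, and the max_tokens value here only limits the reply length of this test call, not the maximum number of tokens you configured for the model.

import json
import urllib.request

# Placeholder values: replace with the details you saved in the dialog box.
BASE_URL = "http://my-triton-server:4000/v1"
MODEL = "my-model"    # example name; use the model you selected from the Model details list
API_ACCESS_KEY = ""

body = json.dumps({
    "model": MODEL,
    "messages": [{"role": "user", "content": "Reply with the word OK."}],
    "max_tokens": 16,  # keeps this smoke test short; unrelated to the configured token limit
}).encode()

headers = {"Content-Type": "application/json"}
if API_ACCESS_KEY:
    headers["Authorization"] = f"Bearer {API_ACCESS_KEY}"

# Passing data makes this a POST to the OpenAI-compatible chat completions endpoint.
request = urllib.request.Request(BASE_URL.rstrip("/") + "/chat/completions",
                                 data=body, headers=headers)
with urllib.request.urlopen(request) as response:
    reply = json.loads(response.read())
    print(reply["choices"][0]["message"]["content"])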