Deployed large language models (LLMs)
This topic describes how to view the deployed large language models (LLMs) and their details in one place. You can also add a local LLM.
To view the deployed LLMs
- Sign in to BMC AMI Platform by using your credentials.
- Click the Platform manager tab.
- From the menu in the left pane, select BMC AMI AI Manager > AI services settings > Deployed LLMs.
The Deployed large language model (LLM) window lists all the deployed LLMs.
To add a local model
- On the upper left of the LLM settings window, click Add local model.
- In the Add local large language model dialog box, select the Public API or the Self-hosted tab, depending on your model type.
For the Public API tab, follow these steps:
- Select the hosted platform.
- Enter an API access key.
- Click Test connection.
- If a success confirmation message is displayed, click Next>>. If a failure message is displayed, verify that the API access key you provided is valid; you can also test the key outside the product, as in the sketch after these steps.
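If the connection test keeps failing, it can help to confirm the key outside the product. The following is a minimal Python sketch, assuming an OpenAI-style hosted platform whose models endpoint is https://api.openai.com/v1/models and an API key stored in an OPENAI_API_KEY environment variable; both are examples, so substitute the endpoint and key location that your hosted platform uses.

import json
import os
import urllib.request

# Example values: adjust the endpoint and key source for your hosted platform.
API_KEY = os.environ.get("OPENAI_API_KEY", "")
MODELS_URL = "https://api.openai.com/v1/models"

request = urllib.request.Request(
    MODELS_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
)

# A 200 response with a model list indicates that the key is accepted;
# a 401 or 403 error usually means the key is wrong, expired, or lacks access.
with urllib.request.urlopen(request) as response:
    payload = json.loads(response.read())
    for model in payload.get("data", []):
        print(model.get("id"))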
For the Self-hosted tab, follow these steps:
- Complete the following fields:
Inference engine: Select the inference engine that you want to use. vLLM, Triton, and OpenAI-compatible inference servers are supported.
Base URL: Enter the base URL of your inference engine. For example:
- Triton: http://my-triton-server:4000/v1
- OpenAI compatible: https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/
API access key: (Optional) Key to access the LLM server API, if one is configured on the LLM server.
- Click Test connection.
- If a success confirmation message is displayed, click Next>>. If a failure message is displayed, verify that the details you provided are valid; you can also check the base URL directly, as in the sketch after these steps.
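If the connection test fails for a self-hosted engine, you can check the base URL directly. The following is a minimal Python sketch, assuming an OpenAI-compatible inference server (such as vLLM) that lists its models at <base URL>/models; the base URL and the API access key are placeholders for your own values.

import json
import urllib.request

# Placeholder values: replace with the details you entered in the dialog box.
BASE_URL = "http://my-triton-server:4000/v1"   # base URL of your inference engine
API_ACCESS_KEY = ""                            # leave empty if no key is configured

headers = {}
if API_ACCESS_KEY:
    headers["Authorization"] = f"Bearer {API_ACCESS_KEY}"

# Most OpenAI-compatible servers expose GET <base URL>/models.
request = urllib.request.Request(BASE_URL.rstrip("/") + "/models", headers=headers)
with urllib.request.urlopen(request) as response:
    payload = json.loads(response.read())
    for model in payload.get("data", []):
        print(model.get("id"))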
- On the Step 2: Add details and save tab, complete the following fields with the details of the on-premises LLM server:
Model: Select a model from the Model details list.
Version: Version of the LLM.
Display name: (Optional) Name that identifies the model in the BMC AMI AI Management Console.
Description: Description of the LLM.
Supported integrations: (Optional) Select the supported integration.
Include this LLM in the BMC AMI Assistant chat LLM list: Enable this option if you want to use the selected LLM in the BMC AMI Assistant chat list.
Maximum number of tokens: Non-zero positive integer representing the maximum number of tokens configured for the model on the on-premises LLM server. A minimum of 100,000 tokens is required for BMC AMI Assistant chat.
- Click Save.
A success confirmation message is displayed. If the procedure fails, the user interface displays an error message with the reason for the failure. After you have added the LLM, see Integrations settings and BMC AMI Assistant chat settings to start using it.
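Before you enable those integrations, you might want to confirm that the newly added model answers requests. The following is a minimal Python sketch, assuming a self-hosted, OpenAI-compatible inference server; the base URL, model name, and API access key are placeholders, and the max_tokens value here only limits the reply length of this test call, not the maximum number of tokens you configured for the model.

import json
import urllib.request

# Placeholder values: replace with the details you saved in the dialog box.
BASE_URL = "http://my-triton-server:4000/v1"
MODEL = "my-model"    # example name; use the model you selected from the Model details list
API_ACCESS_KEY = ""

body = json.dumps({
    "model": MODEL,
    "messages": [{"role": "user", "content": "Reply with the word OK."}],
    "max_tokens": 16,  # keeps this smoke test short; unrelated to the configured token limit
}).encode()

headers = {"Content-Type": "application/json"}
if API_ACCESS_KEY:
    headers["Authorization"] = f"Bearer {API_ACCESS_KEY}"

# Passing data makes this a POST to the OpenAI-compatible chat completions endpoint.
request = urllib.request.Request(BASE_URL.rstrip("/") + "/chat/completions",
                                 data=body, headers=headers)
with urllib.request.urlopen(request) as response:
    reply = json.loads(response.read())
    print(reply["choices"][0]["message"]["content"])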