Deployed large language model (LLM)


This topic describes how to view the deployed large language models (LLMs) and their details in one place. You can also add a local LLM.

To view the deployed LLMs

  1. Sign in to BMC AMI Platform by using your credentials. 
  2. Click the Platform manager tab.
  3. From the menu in the left pane, select BMC AMI AI Manager > AI services settings > Deployed LLMs.
    The Deployed Large Language model (LLM) window lists all the deployed LLMs.


To add a local model

  1. In the upper-left corner of the Deployed Large Language model (LLM) window, click Add local model.
  2. In the Add local large language model dialog box, select your model type.
  3. Select the Public API or the Self-hosted tab, depending on your model type. 

    For the Public API tab, follow these steps:

    1. Select the hosted platform. 
    2. Enter an API access key.
    3. Click Test connection.
    4. If a success confirmation message is displayed, click Next>>. (If a failure message is displayed, verify that the provided API access key is valid.)
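    If you want to verify the key outside the UI first, the following minimal Python sketch performs the same kind of check as Test connection for an OpenAI-hosted model. The key value is a placeholder, and the use of the public OpenAI /v1/models route is an illustrative assumption, not part of BMC AMI Platform:

      import requests

      API_KEY = "sk-..."  # placeholder: your API access key

      # Listing models is a lightweight way to confirm that a key is accepted.
      resp = requests.get(
          "https://api.openai.com/v1/models",
          headers={"Authorization": f"Bearer {API_KEY}"},
          timeout=10,
      )
      if resp.ok:
          print("Key accepted; models available:", len(resp.json()["data"]))
      else:
          # A 401 response here corresponds to the failure message in the dialog box.
          print(f"Connection check failed ({resp.status_code})")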

    For the Self-hosted tab, follow these steps:

    1. Complete the following fields:

       Inference engine: Select the inference engine that you want to use. vLLM, Triton, and OpenAI-compatible inference servers are supported.

       Base URL: Enter the base URL of your inference engine.

       API access key: (Optional) Key to access the LLM server API, if an access key is configured on the LLM server.

    2. Click Test connection.
    3. If a success confirmation message is displayed, click Next>>. (If a failure message is displayed, verify that the provided details are valid.)
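    To check a self-hosted endpoint before entering its details, you can probe it directly. The sketch below assumes an OpenAI-compatible server (such as vLLM in OpenAI-compatible mode), which exposes a GET /v1/models route; the base URL and key are placeholders:

      import requests

      BASE_URL = "http://llm-server.example.com:8000"  # placeholder base URL
      API_KEY = None  # set only if an access key is configured on the server

      headers = {"Authorization": f"Bearer {API_KEY}"} if API_KEY else {}

      # OpenAI-compatible servers such as vLLM list their models under /v1/models.
      resp = requests.get(f"{BASE_URL}/v1/models", headers=headers, timeout=10)
      resp.raise_for_status()
      print("Models served:", [m["id"] for m in resp.json()["data"]])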
    Important

    To set up an OpenAI-compatible inference server with AWS Bedrock:

    1. Open the aws-samples/bedrock-access-gateway GitHub repository.
    2. Follow the steps to set up the Llama3.1 model in the OpenAI-compatible API format.
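    After the gateway is deployed, its endpoint can serve as the Base URL on the Self-hosted tab. As a rough sketch, assuming a hypothetical deployment of the gateway (the gateway URL, API key, and Bedrock model ID below are all placeholders), a chat request through the OpenAI Python SDK would look like this:

      from openai import OpenAI

      # All three values are placeholders for a hypothetical deployment of
      # aws-samples/bedrock-access-gateway, which serves an OpenAI-compatible API.
      client = OpenAI(
          base_url="https://my-gateway.example.com/api/v1",
          api_key="my-gateway-key",
      )
      reply = client.chat.completions.create(
          model="meta.llama3-1-8b-instruct-v1:0",  # placeholder Bedrock model ID
          messages=[{"role": "user", "content": "Say hello."}],
      )
      print(reply.choices[0].message.content)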


  4. On the Step 2: Add details and save tab, add the on-premises LLM server details by completing the following fields:

     

    Model: Select a model from the Model details list.

      Important
      The following models are supported by OpenAI when using the Public API model type:
      • gpt-4o-mini-2024-07-18 and later
      • gpt-4o-2024-08-06 and later

    Version: Version of the LLM.

    Display name: (Optional) Name under which the model is displayed in BMC AMI AI Management Console.

    Description: Description of the LLM.

    Supported integrations: (Optional) Select the supported integration.

    Include this LLM in the BMC AMI Assistant chat LLM list: Enable this option if you want to use the selected LLM in the BMC AMI Assistant chat list.

    Maximum number of tokens: Nonzero positive integer representing the maximum number of tokens configured for the model on the on-premises LLM server. A minimum of 100,000 tokens is required for BMC AMI Assistant chat.

  5. Click Save.

A success confirmation message is displayed. If the procedure fails, the user interface displays an error message that states the reason for the failure. After you add the LLM, see Integrations settings and BMC AMI Assistant chat settings to use the LLM.

 
