Upgrading to the latest AI model
As a tenant administrator, you upgrade to the latest version of the fine-tuned model for BMC Helix AIOps to benefit from the enhanced user experience and performance improvements. BMC Helix provides the capability to bring your own GPU processing; however, you must use the fine-tuned model provided for BMC Helix AIOps.
Supported cloud platforms
| Cloud platforms | Model | BMC Helix AIOps versions |
|---|---|---|
| Google Cloud Platform Vertex AI, Microsoft Azure AI | HelixGPT-v7 | 25.3 and later |
| Google Cloud Platform Vertex AI, Microsoft Azure AI | HelixGPT-v6.2 | 25.2 and later |
Hardware and software requirements
| Parameter | Google Cloud Platform Vertex AI | Microsoft Azure AI |
|---|---|---|
| Machine type | a2-highgpu-1g | Standard_NCADSA100v4 Family Cluster Dedicated vCPUs |
| GPU | NVIDIA Tesla A100 | NVIDIA Tesla A100 |
Upgrade process overview
The following graphic provides an overview of the steps required to upgrade the model in your environment:
Before you begin
Make sure you have the following details:
- Docker Inference Engine
- Model artifacts
- Details of the previously deployed model:
  - Workspace name
  - Resource group
  - Container registry details
  - Deployment YAML file
For more information, see Setting up your environment to leverage agentic AI capabilities in BMC Helix AIOps.
Task 1: To obtain a model from BMC Helix
Contact BMC Helix support to obtain the fine-tuned model, BMC Helix AIOps HelixGPT-v7.
BMC Helix provides the model by using one of the following approaches:
- A Docker image tarball file with all model artifacts.
- The credentials and details to access the container registry where the model is available.
After you obtain the latest model, note details such as its name, the artifact path name, and the model registry path name. This information is required when you deploy the model in your cloud environment.
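If BMC Helix delivers the model as a Docker image tarball, you can record these details by inspecting the archive locally. A minimal sketch, assuming a hypothetical file name helix_gpt_model_v7.tar.gz and an optional vendor-provided checksum file:

```shell
# List the packaged model artifacts without extracting the archive;
# note the deployment spec file name, which is passed later (without
# its extension) as the DEPLOYMENT_SPEC environment variable.
tar -tzf helix_gpt_model_v7.tar.gz

# If BMC Helix provided a checksum file, verify the download first.
sha256sum -c helix_gpt_model_v7.tar.gz.sha256
```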
Task 2: To deploy the model to your cloud
Depending on your cloud environment, perform the following steps to deploy the fine-tuned model.
To deploy the model in Google Cloud Platform Vertex AI
- On a local host, extract the model artifacts provided by BMC Helix:
  tar -xzvf <helix_gpt_model_version>.tar.gz
- Upload the model to the Google Cloud Storage bucket:
  gsutil cp -r <helix_gpt_model_version> gs://<your-bucket>/model/
- Prepare the custom inference Docker image:
  - If you have a Docker image tarball, load the image file:
    docker load -i /path/to/model_container.tar
  - If you are using the container registry, log in and pull the image from the registry:
    docker login containers.bmc.com
    (Specify the credentials provided by BMC Helix)
    docker pull containers.bmc.com/bmc/lpade:helix-gpt-vllm-docker-<build_number>
    (Specify the image tag provided by BMC Helix)
- Push the Docker image to the Google Cloud container registry:
  docker tag <bmc helix image> <Google Cloud Container Registry tag>
  docker push <Google Cloud Container Registry path>
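For example, a sketch of the tag-and-push step, assuming a hypothetical Google Artifact Registry repository us-central1-docker.pkg.dev/my-project/helix-gpt; substitute your own project, region, repository, and image tag:

```shell
# Authenticate Docker with the target registry host.
gcloud auth configure-docker us-central1-docker.pkg.dev

# Tag the BMC Helix image with the registry path, then push it.
docker tag containers.bmc.com/bmc/lpade:helix-gpt-vllm-docker-<build_number> \
  us-central1-docker.pkg.dev/my-project/helix-gpt/vllm-vertex:<build_number>
docker push us-central1-docker.pkg.dev/my-project/helix-gpt/vllm-vertex:<build_number>
```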
Now, the model artifacts are available in the Google Cloud Storage bucket, and the image is available in the Google Cloud Container Registry.
To import the model
- Navigate to Model Registry from the Vertex AI navigation menu.
- Click Import and then click Import as new version.
- On the Import Model page, select the previous version of the model, and add an optional description.
- Select the region, and click Continue.
  Select the region that matches both your bucket's region and the Vertex AI regional endpoint that you are using.
- Navigate to the Model settings page and select Import an existing container.
- In the Custom container settings section, click Browse in the Container image field, and then click the Container Registry tab to select the container image.
- Click Browse in the Model artifact location field and select the Cloud Storage path to the directory that contains your model artifacts.
- In the Arguments section, specify the following parameters and click Continue:

| Field | Description | Example value |
|---|---|---|
| Environment variables | Specify the file name of the deployment spec (without the file extension) included in the model artifacts. | DEPLOYMENT_SPEC=zhp52uqvaxvacmt4u2tbezojfucjkf4f-helix-gpt-v6-instruct |
| Prediction route | Specify the HTTP path to send prediction requests to. | /predictions |
| Health route | Specify the HTTP path to send health checks to. | /ping |
| Port | Specify the port number to expose from the container. | 8080 |

- On the Explainability options page, retain the default No explainability option, and click Import.
After a few minutes, the model is displayed on the Models page. For more information about importing models in GCP Vertex AI, see the online documentation at https://cloud.google.com/vertex-ai/docs/model-registry/import-model#custom-container.
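If you prefer the CLI over the console, the import can also be done with gcloud. A sketch under the same assumptions as above (hypothetical project, repository, and bucket names), using standard gcloud ai models upload options:

```shell
# Import the model into the Vertex AI Model Registry using the
# custom inference container and the artifacts uploaded earlier.
gcloud ai models upload \
  --region=us-central1 \
  --display-name=helix-gpt-v7 \
  --container-image-uri=us-central1-docker.pkg.dev/my-project/helix-gpt/vllm-vertex:<build_number> \
  --artifact-uri=gs://<your-bucket>/model/ \
  --container-predict-route=/predictions \
  --container-health-route=/ping \
  --container-ports=8080 \
  --container-env-vars=DEPLOYMENT_SPEC=<deployment_spec_file_name>
```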
To add the model to an existing endpoint
- Select the model and then click Deploy and test.
- Click Deploy to endpoint.
- Select Add to existing endpoint and click Continue.
- On the Model Settings page, click to edit the existing model version, and update the Traffic Split value to 0.
- Click the latest upgraded version and update the Traffic Split value to 100.
- Retain the default Standard access setting and click Continue.
- On the Model settings page, specify values for the following fields and retain the default values for the other fields:
- Machine Type: a2-highgpu-1g, 12 vCPUs, 85 GiB Memory
- Accelerator Type: NVIDIA Tesla A100
- Accelerator Count: 1
- Click Continue and then click Deploy.
The latest model is deployed to an existing endpoint, and all requests are routed to the new version of the model.
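The same deployment can be sketched with gcloud, assuming hypothetical endpoint and model IDs; the special deployed-model ID 0 in --traffic-split refers to the model being deployed in this request:

```shell
# Deploy the new model version to the existing endpoint and route
# 100% of the traffic to it.
gcloud ai endpoints deploy-model <endpoint_id> \
  --region=us-central1 \
  --model=<model_id> \
  --display-name=helix-gpt-v7 \
  --machine-type=a2-highgpu-1g \
  --accelerator=type=nvidia-tesla-a100,count=1 \
  --traffic-split=0=100
```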
To deploy the model in Microsoft Azure AI
- Log in to the Microsoft Azure CLI:
  az login
- Verify that your endpoint is active and configured correctly:
  az ml online-endpoint show --name <name of the endpoint> --resource-group <name of the resource group> --workspace-name <name of the workspace>
- Push the Docker image to the Azure Container Registry:
  docker push helixgptreg.azurecr.io/vllm-vertex:<tag>
- Update the deployment YAML with the model version, name, path, and environment settings:
  name: helix-gpt-v7-25-3-deploy
  endpoint_name: helix-gpt-v7-25-3-endpoint
  model:
    name: helix-gpt-v7-25-3
    path: ./helix-gpt-v7-25-3
    version: 1
  environment_variables:
    AIP_HEALTH_ROUTE: "/ping"
    AIP_PREDICT_ROUTE: "/score"
    MODEL_BASE_PATH: "/var/azureml-app/azureml-models/helix-gpt-v7-25-3/1/helix-gpt-v7-25-3"
    DEPLOYMENT_SPEC: "agaqnayhu2tstm7s3z5xmnmdugrzccsa-helix-gpt-v7_2"
    AIP_STORAGE_URI: "/var/azureml-app/azureml-models/helix-gpt-v7-25-3/1/helix-gpt-v7-25-3"
  environment:
    image: helixgptreg.azurecr.io/vllm-vertex:25.3.00-b7a5683-15
    inference_config:
      liveness_route:
        port: 8080
        path: /ping
      readiness_route:
        port: 8080
        path: /ping
      scoring_route:
        port: 8080
        path: /score
  request_settings:
    request_timeout_ms: 180000
    max_concurrent_requests_per_instance: 8
  instance_type: Standard_NC24ads_A100_v4
  instance_count: 1
- Deploy the updated YAML file:
  az ml online-deployment update --file deployment.yaml --resource-group HelixGPT --workspace-name helix-gpt-v7-25-3-ws
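To confirm the rollout from the CLI, you can check the deployment state and container logs. A sketch using the names from the YAML above:

```shell
# Show the deployment; look for a provisioning state of Succeeded.
az ml online-deployment show \
  --name helix-gpt-v7-25-3-deploy \
  --endpoint-name helix-gpt-v7-25-3-endpoint \
  --resource-group HelixGPT \
  --workspace-name helix-gpt-v7-25-3-ws

# Retrieve container logs if the deployment reports an unhealthy state.
az ml online-deployment get-logs \
  --name helix-gpt-v7-25-3-deploy \
  --endpoint-name helix-gpt-v7-25-3-endpoint \
  --resource-group HelixGPT \
  --workspace-name helix-gpt-v7-25-3-ws
```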
Task 3: To verify the upgrade and test the model
After you deploy the model, test the endpoint:
curl -X POST <scoring-uri> \
-H "Authorization: Bearer <key>" \
-H "Content-Type: application/json" \
-d '{"input": "<add test prompt here>"}'
For example, if you send a test prompt that asks the model to rank a problem and generate resolution steps, the response contains the predictions, which indicates that the model is deployed and generating responses successfully.
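For an Azure endpoint, you can retrieve the scoring URI and key used in the curl call with the CLI. A sketch using the deployment names from Task 2:

```shell
# Get the scoring URI to use as <scoring-uri>.
az ml online-endpoint show \
  --name helix-gpt-v7-25-3-endpoint \
  --resource-group HelixGPT \
  --workspace-name helix-gpt-v7-25-3-ws \
  --query scoring_uri --output tsv

# Get the primary key to use as the Bearer token <key>.
az ml online-endpoint get-credentials \
  --name helix-gpt-v7-25-3-endpoint \
  --resource-group HelixGPT \
  --workspace-name helix-gpt-v7-25-3-ws \
  --query primaryKey --output tsv
```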
There are several ways to verify that the model is upgraded. Perform the following steps for a basic verification:
- From the Google Cloud Console, navigate to Vertex AI > Endpoints.
- Click the endpoint where you deployed the model.
- On the Deployed models tab:
  - Confirm that your new model version appears in the list.
  - Check that the Deployment state is Deployed.
  - Verify that the Traffic split percentage for the new model is 100.
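You can make the same checks from the CLI; a sketch assuming a hypothetical endpoint ID:

```shell
# Describe the endpoint; the deployedModels list and the trafficSplit
# map should show 100 for the new model version.
gcloud ai endpoints describe <endpoint_id> --region=us-central1
```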