Monitoring BMC Helix on-premises environments


One option to monitor your BMC Helix Service Management and BMC Helix IT Operations Management (BMC Helix ITOM) on-premises environments is to use a monitoring stack of Prometheus and Grafana, along with monitoring dashboards. Prometheus is a metrics database. It monitors the services, collects metrics, and then stores them. This data is then made available to Grafana. Grafana is a data visualization or dashboarding tool.

The following dashboards are available:

| Dashboard | Purpose | Reference ID |
|---|---|---|
| Resource Utilization | Monitor container memory and CPU for over-utilization. | Included in bundle |
| Elasticsearch | Track the cluster's health based on application and infrastructure resource usage and feature usage. | 2322 |
| (Apache) Kafka | Monitor topic and broker message throughput and also broker size. | 7589 |
| PostgreSQL | Monitor connections, transaction rates, and query performance. | 9628 |
| Redis | Monitor resource usage, command performance, and database size. | 11835 |
| VictoriaMetrics | Monitor database transaction performance and uptime. | 11176 |
| Persistent Volumes | Monitor utilization and performance of Persistent Volumes within the Kubernetes cluster. | 13646 |

Before you begin

Install Prometheus and Grafana. See https://github.com/prometheus-operator/prometheus-operator or your Kubernetes engine documentation.

To monitor BMC Helix on-premises environments

  1. Install the service monitor chart by using the following command:

    helm install helix-monitoring ./chart --set namespace=<namespace in which platform is installed> -n <namespace in which Prometheus is deployed>

    The service monitor chart installs relevant service monitors into Prometheus to pull metrics from the data lake clusters.

    To find the label that Prometheus uses to select service monitors, run the following command:

    kubectl -n <namespace in which Prometheus is deployed> describe prometheuses.monitoring.coreos.com | grep -A 2 'Service Monitor Selector'

    Sample output:

    Service Monitor Selector:
      Match Labels:
        Release:  <required label value>

    Run the following command to update the service monitors with the label:

    kubectl -n <namespace in which Prometheus is deployed> label servicemonitor --all release=<required label value>

  2. Import the cluster-monitor.json file:
    1. Log in to Grafana.
    2. Select Dashboards > New > Import.
    3. Paste the contents of the cluster-monitor.json file into Import via dashboard JSON model, and click Load.
      Alternatively, click Upload JSON file and upload the cluster-monitor.json file.
    4. (Optional) Under Options, in the Name field, enter the folder name where you want to save the cluster-monitor.json file or select a folder from the Folder list.
    5. Click Import.
  3. Import an out-of-the-box dashboard:
    1. Select Dashboards > New > Import.
    2. In the Import via grafana.com field, enter the ID of the dashboard that you want to import.
      For example, to import the Elasticsearch dashboard, enter 2322.

      We recommend that you import all of the following referenced dashboards:

      | Dashboard | Reference ID |
      |---|---|
      | Elasticsearch | 2322 |
      | Kafka | 7589 |
      | PostgreSQL | 9628 |
      | Redis | 11835 |
      | VictoriaMetrics | 11176 |
      | Persistent Volumes | 13646 |

      The dashboard IDs are present in the dashboard-import.txt file.

    3. (Optional) Under Options, in the Name field, enter the folder name where you want to save the dashboard or select a folder from the Folder list.
    4. From the Prometheus list, select Prometheus.
    5. Click Import.
      After the import is complete, you can view the dashboard.
  4. Repeat step 3 to import other dashboards that you want.
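
The label lookup in step 1 can be scripted if you prefer not to read the value from the describe output by hand. The following is a minimal sketch, not part of the BMC bundle: extract_release_label is a hypothetical helper name, and it assumes the describe output has the same shape as the sample output in step 1 (a Release entry within two lines of Service Monitor Selector).

```shell
#!/usr/bin/env bash
# Sketch only: extract_release_label is a hypothetical helper, not BMC tooling.
# It reads `kubectl describe prometheuses.monitoring.coreos.com` output on stdin
# and prints the Release label value found under "Service Monitor Selector".
# Assumption: the Release line appears within two lines of the selector header,
# as in the sample output. Only the first match is printed.
extract_release_label() {
  awk '
    /Service Monitor Selector:/            { in_selector = 1; n = 0; next }
    in_selector && /^[[:space:]]*Release:/ { print $2; exit }
    in_selector && ++n >= 2                { in_selector = 0 }
  '
}

# Example usage (requires cluster access, so it is left commented out):
# kubectl -n <namespace in which Prometheus is deployed> \
#   describe prometheuses.monitoring.coreos.com | extract_release_label
```

If the function prints nothing, the selector probably uses a different label key or has no match labels, so check the full describe output before labeling the service monitors.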

Adding a service monitor or JMX agent port

You can scrape metrics into Prometheus by adding a service monitor or JMX agent port, using the following details:

  • For BMC Helix Digital Workplace, you can scrape using the metrics endpoint on port 8080.
  • For BMC Helix Innovation Studio and Smart IT, you can scrape using the metrics endpoint on port 7070.
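
As an illustration of adding a service monitor for one of these endpoints, the sketch below prints a ServiceMonitor manifest for the Prometheus Operator. It is an assumption-laden example rather than the chart's own template: render_service_monitor is a hypothetical helper, and the app label key, the /metrics path, the 30s interval, and the named Service port are placeholders that you would need to match to your deployment (for example, a Service port that maps to 8080 or 7070).

```shell
#!/usr/bin/env bash
# Sketch only: render_service_monitor is a hypothetical helper, not BMC tooling.
# Arguments (all placeholders supplied by you):
#   $1 monitor name             $2 namespace where Prometheus is deployed
#   $3 namespace of the app     $4 value of the app label on the target Service (assumed key: app)
#   $5 name of the Service port that exposes metrics
#   $6 release label value required by the Prometheus service monitor selector
render_service_monitor() {
  local name="$1" monitor_ns="$2" target_ns="$3" app_label="$4" port_name="$5" release="$6"
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ${name}
  namespace: ${monitor_ns}
  labels:
    release: ${release}
spec:
  namespaceSelector:
    matchNames:
      - ${target_ns}
  selector:
    matchLabels:
      app: ${app_label}
  endpoints:
    - port: ${port_name}
      path: /metrics
      interval: 30s
EOF
}

# Example usage (requires cluster access, so it is left commented out):
# render_service_monitor dwp-metrics monitoring helix dwp http-metrics prometheus | kubectl apply -f -
```

Printing the manifest instead of applying it directly lets you review the selector and port name before anything changes in the cluster.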

 

To view out-of-the-box dashboards on Grafana

  1. Log in to Grafana.
  2. Select Dashboards > Browse.

Important metrics for monitored components

The out-of-the-box dashboards monitor the following components:

Open Distro Elasticsearch

Open Distro Elasticsearch is a data store that is used specifically for Event management and Log Analytics. The Open Distro version adds security and operational features that make it more secure and scalable for BMC customers.

Important metrics include:

  • CPU, memory, and storage
  • Cluster and index status (as green, yellow, or red)
  • Replication
  • Number of shards per node
  • Request rate
  • Indexing rate

Kafka

Kafka is a message broker that is used for asynchronous communication between services and that handles the ingestion of metrics, events, logs, and topology. This component is key to the real-time availability of the metrics, events, logs, and topology brought into BMC Helix.

Important metrics include:

  • CPU, memory, and storage
  • Topic lag
  • Replication status
  • Partitions per cluster
  • Ingestion rate

VictoriaMetrics

VictoriaMetrics is a highly scalable time-series data store that is used primarily for raw and aggregated metric storage. This component is used for BMC Helix Operations Management and BMC Helix AIOps metrics.

Important metrics include:

  • CPU, memory, and storage
  • Ingestion rate
  • Request rate
  • Rows per insert

PostgreSQL

PostgreSQL is a relational database that extends SQL with many features for scaling and storing complex data workloads. BMC Helix uses this component for configuration and traditional relational data, such as users, groups, and roles.

Important metrics include:

  • CPU, memory, and storage
  • Transaction rates
  • Number of connections
  • Query completion time
  • Replication lag

Redis

Redis is an in-memory NoSQL key-value store that provides readily available caching. Redis can store several types of data structures. This cache is used for several use cases, including session management and topology retrieval.

Important metrics include:

  • CPU, memory, and storage
  • Number of connections
  • Command execution rate
  • Duration of objects

 
