Troubleshooting monitoring solution setup issues


Use the information in this topic to troubleshoot issues related to monitoring on-premises BMC Helix IT Operations Management. 

The BMC Helix Log Analytics connector for Kubernetes fails

Perform the following steps to resolve the issue:

  1. Add the following service accounts to the privileged Security Context Constraints (SCC):
    oc adm policy add-scc-to-user privileged -z fluent-bit -n bmc-k8s-logs
    oc adm policy add-scc-to-user privileged -z fluentd -n bmc-k8s-logs
    oc adm policy add-scc-to-user privileged -z fluent-operator -n bmc-k8s-logs
  2. In the values.yaml file, go to the Fluentbit section, and add the following key-value pair under securityContext:
    privileged: true
  3. Re-run Helm upgrade.

 

Log Analytics fluentbit connector pod does not start

You get the following error:
Error: failed to mkdir /containers: mkdir /containers: operation not permitted

Perform the following steps to resolve the issue:

  1. In the values.yaml file, change the value of the parameter accessMode to ReadWriteOnce.
  2. Re-run Helm upgrade.

 

Prometheus Exporter for Apache Zookeeper not collecting data

Perform the following steps:

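  1. To debug the issue, run the following commands within your BMC Helix Agent Monitor. They create a service account with the cluster-monitoring-view role, retrieve its token, and use curl to query Thanos Querier for the Zookeeper metrics: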

    oc project openshift-monitoring

    oc create sa bhom-self-monitoring-sa

    oc adm policy add-cluster-role-to-user cluster-monitoring-view -z bhom-self-monitoring-sa

    SECRET=`oc -n openshift-monitoring describe sa bhom-self-monitoring-sa | awk '/Tokens/{ print $2 }'`

    TOKEN=`oc -n openshift-monitoring get secret $SECRET --template='{{ .data.token | base64decode }}'`


    curl -k -H "Authorization: Bearer ${TOKEN}" 'https://thanos-querier.openshift-monitoring.svc.cluster.local:9091/api/v1/query_range?query=%7B__name__=~%22ack.*|add.*|approximate.*|bytes.*|close.*|commit.*|concurrent.*|connection.*|dbinittime.*|dead.*|diff.*|digest.*|election.*|follower.*|fsynctime.*|global.*|jvm.*|leader.*|learner.*|learners.*|local.*|looking.*|max.*|node.*|num.*|om.*|open.*|outstanding.*|packets.*|pending.*|prep.*|propagation.*|proposal.*|quit.*|quorum.*|read.*|readlatency.*|reads.*|request.*|requests.*|response.*|revalidate.*|server.*|session.*|sessionless.*|snap.*|snapshottime.*|stale.*|startup.*|sync.*|synced.*|time.*|unrecoverable.*|watch.*|write.*|znode.*%22%7D&start=2024-06-24T00:00:45.239834Z&end=2024-06-24T00:05:00.919410Z&step=60s'

     

  2. Change the date range (the start and end parameters in the URL) and make sure that the query returns metrics.
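
The date range and the check for returned metrics can be scripted. The following is a minimal sketch, not part of the product: range_params and has_metrics are hypothetical helper names, the timestamp helper assumes GNU date, and the response check assumes that the query API returns compact JSON in the usual Prometheus layout (a "status" field and a "result" array).

```shell
#!/usr/bin/env bash
# Sketch only: range_params and has_metrics are hypothetical helper names,
# not part of oc, curl, or any BMC tooling.

# Print "start=<ts>&end=<ts>" covering the last N minutes (default 5), in UTC.
# Assumes GNU date (the -d option); BSD/macOS date uses different flags.
range_params() {
  local minutes="${1:-5}"
  local start end
  start="$(date -u -d "${minutes} minutes ago" +%Y-%m-%dT%H:%M:%SZ)"
  end="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  printf 'start=%s&end=%s\n' "$start" "$end"
}

# Read a query_range JSON response on stdin. Succeed only if the status is
# "success" and the result array is not empty. Assumes compact JSON
# (no whitespace between keys and values).
has_metrics() {
  local body
  body="$(cat)"
  case "$body" in
    *'"status":"success"'*) ;;
    *) return 1 ;;
  esac
  case "$body" in
    *'"result":[]'*) return 1 ;;
  esac
  return 0
}
```

To use the helpers, put the URL from step 1 in double quotes, replace its start and end parameters with $(range_params 15), and pipe the curl output to has_metrics. A non-zero exit status means that the query failed or returned no series for that range.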

 

 
