Troubleshooting EFK logging issues


Use this topic to troubleshoot EFK (Elasticsearch, Fluentd, and Kibana) logging issues.

The Kibana URL is not accessible

This issue occurs when no external IP is assigned to the Kibana service, so Kibana cannot be accessed from hosts that are outside the cluster but in the same network. Perform the following steps:

  • Verify that all pods and services are running.
  • Verify that an external IP is assigned to the Kibana service. Use the following command:

    kubectl get svc -n <BMC Helix Platform namespace>
  • If no external IP is assigned, run the following command, replacing the example IP address 10.129.111.192 with the IP address of any master node:

    kubectl patch service elasticsearch-logging-kibana -n <BMC Helix Platform namespace> -p '{"spec":{"externalIPs":["10.129.111.192"]}}'
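
The check-and-patch steps above can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not BMC-provided tooling: the function names build_external_ip_patch and assign_kibana_external_ip are hypothetical, the service name is taken from the patch command above, and the namespace and master-node IP are values that you supply.

```shell
# Build the JSON patch body that sets a single external IP on a service.
build_external_ip_patch() {
    printf '{"spec":{"externalIPs":["%s"]}}' "$1"
}

# Patch the Kibana service only if it has no external IP yet.
# Arguments: namespace, master-node IP (both supplied by you).
assign_kibana_external_ip() {
    namespace="$1"
    node_ip="$2"
    # Prints nothing when the externalIPs field is not set on the service.
    current=$(kubectl get svc elasticsearch-logging-kibana -n "$namespace" \
        -o jsonpath='{.spec.externalIPs}')
    if [ -z "$current" ]; then
        kubectl patch service elasticsearch-logging-kibana -n "$namespace" \
            -p "$(build_external_ip_patch "$node_ip")"
    else
        echo "External IPs already set: $current"
    fi
}
```

For example, after loading the functions into your shell, you might run assign_kibana_external_ip with your BMC Helix Platform namespace and a master-node IP as the two arguments. Building the patch body in a separate function avoids the nested quoting that is easy to get wrong on the command line.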


The Kibana pod is in CrashLoopBackOff

This issue might occur if the network settings of multiple hosts are different. Perform the following steps to specify the host of the back-end server:

  1. Edit the Kibana config map by using the following command:

    kubectl edit cm -n <BMC Helix Platform namespace> elasticsearch-logging-kibana-conf

     In the config map, set the following value:

    server.host: "0.0.0.0"
  2. Delete the Kibana pod so that it is re-created with the updated setting. Use the following command:

    kubectl delete pod <podname> -n <BMC Helix Platform namespace>
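
If you do not know the Kibana pod name, the deletion step can be sketched as follows. This is a rough sketch that assumes the Kibana pod name contains the string "kibana", which this topic does not confirm; the function names kibana_pod_names and restart_kibana_pods are hypothetical.

```shell
# Read `kubectl get pods` output on stdin and print the names of pods
# whose name contains "kibana" (assumption: the Kibana pod is named so).
kibana_pod_names() {
    awk 'NR > 1 && $1 ~ /kibana/ { print $1 }'
}

# Delete every matching pod in the given namespace, then list the
# replacement pods so that you can check their status.
restart_kibana_pods() {
    namespace="$1"
    for pod in $(kubectl get pods -n "$namespace" | kibana_pod_names); do
        kubectl delete pod "$pod" -n "$namespace"
    done
    kubectl get pods -n "$namespace" | kibana_pod_names
}
```

After the pods are re-created, run kubectl get pods -n <BMC Helix Platform namespace> again and confirm that the Kibana pod reaches the Running status instead of CrashLoopBackOff.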


Elasticsearch pod crashes

The Elasticsearch pod crashes with the following error in the pod logs:

    ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];
    Likely root cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes

Workaround

In the opendistro-es/templates/elasticsearch/es-data-sts.yaml and opendistro-es/templates/elasticsearch/es-master-sts.yaml files, add the following lines under securityContext in the StatefulSet (sts) configuration:

    fsGroup: 1000
    runAsUser: 1000

After you add the lines, the relevant part of the StatefulSet configuration appears as follows:

            name: config
            subPath: logging.yml
        dnsPolicy: ClusterFirst
        imagePullSecrets:
        - name: bmc-dtrhub
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          fsGroup: 1000
          runAsUser: 1000
        serviceAccount: default
        serviceAccountName: default
        terminationGracePeriodSeconds: 30
        volumes:
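
After the updated templates are deployed, you can check whether a StatefulSet picked up the security context. The following is a minimal sketch: the function name sts_has_es_security_context is hypothetical, and the StatefulSet name is an argument because this topic does not state the deployed names. Use kubectl get statefulset -n <BMC Helix Platform namespace> to find them.

```shell
# Return 0 when the pod template of the given StatefulSet sets both
# fsGroup and runAsUser to 1000, and non-zero otherwise.
# Arguments: StatefulSet name, namespace (both supplied by you).
sts_has_es_security_context() {
    sts_name="$1"
    namespace="$2"
    # Each jsonpath expression prints an empty string if the field is unset.
    ids=$(kubectl get statefulset "$sts_name" -n "$namespace" \
        -o jsonpath='{.spec.template.spec.securityContext.fsGroup} {.spec.template.spec.securityContext.runAsUser}')
    [ "$ids" = "1000 1000" ]
}
```

For example, sts_has_es_security_context <StatefulSet name> <BMC Helix Platform namespace> && echo "security context applied". Querying the two fields separately keeps the check independent of how a given kubectl version prints a whole map.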


 
