This documentation supports an earlier version of BMC Helix IT Service Management on-premises deployment.

Troubleshooting EFK logging issues


Consult this topic for information about troubleshooting EFK (Elasticsearch, Fluentd, and Kibana) logging issues.

The Kibana URL is not accessible

This issue occurs when no external IP is assigned to the Kibana service, so Kibana cannot be accessed from outside the cluster, even from the same network. Perform the following steps:

  • Verify that all pods and services are running.
  • Verify that an external IP is assigned to the Kibana service. Use the following command:

    kubectl get svc -n bmc-helix-logging
  • If no external IP is assigned, use the following command with any master-node IP (10.129.111.192 below is an example):

    kubectl patch service elasticsearch-logging-kibana -n ade-logging -p '{"spec":{"externalIPs":["10.129.111.192"]}}'
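The JSON payload in the patch command is easy to misquote in a shell. A minimal sketch that builds the payload in a variable first and prints it for inspection; MASTER_IP is a hypothetical variable name, and the IP is the example value from above:

```shell
# Build the externalIPs patch payload in a variable so the quoting stays
# correct, then inspect it before passing it to kubectl.
# MASTER_IP is a hypothetical name; substitute any master-node IP.
MASTER_IP="10.129.111.192"
PATCH="{\"spec\":{\"externalIPs\":[\"${MASTER_IP}\"]}}"
echo "$PATCH"
# Then apply it:
# kubectl patch service elasticsearch-logging-kibana -n ade-logging -p "$PATCH"
```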


The Kibana pod is in CrashLoopBackOff

This issue might occur if the network settings of multiple hosts are different. Perform the following steps to specify the host of the back-end server:

  1. Edit the Kibana config map by using the following command, and set the following value:

    kubectl edit cm -n ade-logging elasticsearch-logging-kibana-conf

    server.host: "0.0.0.0"
  2. Delete the Kibana pod by using the following command:

    kubectl delete pod <pod_name> -n <namespace>
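The server.host setting from step 1 lands inside the kibana.yml that the config map carries. A sketch of how the edited section might look; the data/kibana.yml nesting is an assumption about this config map's layout:

```yaml
# Hypothetical excerpt of the elasticsearch-logging-kibana-conf config map.
# Only server.host comes from the step above; the surrounding keys are assumed.
data:
  kibana.yml: |
    server.host: "0.0.0.0"
```

Binding to 0.0.0.0 makes Kibana listen on all interfaces, so it stays reachable even when the network settings of the hosts differ.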


The Fluentd DaemonSet pods are not visible

This issue occurs if the rbac or psp values are not set correctly in the chart_value.yaml file. Perform the following steps:

  • Ensure that the helix-on-prem-deployment-manager/bmc-helix-logging/efk/fluentd/chart_value.yaml file has the following settings depending on the Kubernetes management platform:
    • (For Rancher Kubernetes) rbac=true and psp=true
    • (For OpenShift Kubernetes) rbac=true and psp=false
  • Ensure that the fluentd-privileged-binding role binding is present in the logging namespace.
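As a sketch, the two values might appear in chart_value.yaml like this for Rancher Kubernetes; the flat key layout is an assumption, so match the keys your chart actually defines, and set psp: false on OpenShift:

```yaml
# Hypothetical excerpt of chart_value.yaml for Rancher Kubernetes;
# the flat key layout is an assumption.
rbac: true
psp: true
```

You can confirm that the role binding exists with kubectl get rolebinding fluentd-privileged-binding -n <logging-namespace>.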


Logs are not displayed in Kibana

This issue occurs when the forwarder runs the container as a non-root user. In the helix-on-prem-deployment-manager/bmc-helix-logging/efk/fluentd/chart_value.yaml file, verify that the securityContext of the forwarder has the following values:

securityContext:
  enabled: true
  runAsUser: 0
  runAsGroup: 0
  fsGroup: 0


EFK pods restart

This issue occurs because the Fluentd DaemonSet checks the health of the nodes. The pods restart until the Fluentd DaemonSet receives a healthy status from the nodes.

If the installer displays the following message, Fluentd needs more time than the default timeout duration to receive the health status of the nodes:

ERROR: Failed to install helm chart: fluentd.
ERROR: Failed to install EFK-Fluentd.

Workaround

  • Wait until the Fluentd pods start.
  • Manually restart the nodes, or restart the Docker service.


Elasticsearch pod crashes

The Elasticsearch pod crashes with the following error in the pod logs:

ElasticsearchException[failed to bind service]; nested: AccessDeniedException[/usr/share/elasticsearch/data/nodes];
Likely root cause: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes

Workaround

In the opendistro-es/templates/elasticsearch/es-data-sts.yaml and opendistro-es/templates/elasticsearch/es-master-sts.yaml files, add the following lines under securityContext in the StatefulSet configuration:

  fsGroup: 1000
  runAsUser: 1000

After adding the lines, the StatefulSet configuration appears as shown below:

        - name: config
          subPath: logging.yml
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: bmc-dtrhub
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 30
      volumes:



 
