Troubleshooting log collection from Kubernetes
Unable to upgrade artifacts by using the Helm upgrade command
Issue symptoms
You cannot upgrade artifacts by using the Helm upgrade command. This issue might occur in the following cases:
- The Helm template has changed.
- The CRD has changed.
- Multiple YAML files exist in the Helm directory.
Resolution
You must uninstall and reinstall the Helm release.
Perform the following steps to uninstall and reinstall the Helm release:
- Run the following command to uninstall the Helm release:
helm uninstall fluent-operator -n bmc-k8s-logs
- Run the following command to verify the artifacts that exist in the namespace:
kubectl get all -n bmc-k8s-logs
- Run the following command to delete the daemonset and statefulset:
kubectl delete daemonset.apps/fluent-bit statefulset.apps/fluentd service/fluentd service/fluent-bit -n bmc-k8s-logs
- If some pods remain in the Terminating state because the daemonset or statefulset is not deleted, delete the pods by using the following command:
kubectl delete pod <pod_ID> -n bmc-k8s-logs --force
- Reinstall Helm. For more information, see Install Helm.
The Kubernetes pods fail
Issue symptoms
The Kubernetes pods fail. This issue might occur because autoscaling for the Kubernetes connector pod is not supported.
Resolution
If you want to add an additional connector node, perform the following actions:
- In the values.yaml file, go to the fluentd: section.
- Change the value of the replicas: property, as shown in the sketch after these steps. The default value is 1. Change it to the number of nodes that you want to add.
- Run the following command:
helm upgrade fluent-operator . -n bmc-k8s-logs -f values.yaml
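For reference, a minimal sketch of the relevant values.yaml section, assuming the chart exposes the replicas property directly under the fluentd: section as described above (other keys are omitted):
fluentd:
  # Number of connector (Fluentd) replicas; the default is 1
  replicas: 2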
Unable to run Helm commands
Issue symptoms
You cannot run Helm commands. This issue might occur because an unsupported version of Helm is installed in the controller.
Resolution
Ensure that Helm v3.2.1 or later is installed in the controller and then run the commands again.
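To confirm the installed version, you can run the standard Helm version command on the controller:
helm version --short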
The aggregator pod is in the Pending state
Issue symptoms
The aggregator pod is in the Pending state. This issue occurs if the correct storage class is not provided in the values.yaml file for Fluentd.
Resolution
Perform the following actions:
- Run the following command to check the available storage classes:
kubectl get sc
- If the storage class is incorrect, add the correct storage class in the values.yaml file (see the example after this list).
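For example, you can confirm that the Pending state is caused by unbound storage by describing the aggregator pod and its persistent volume claims; the pod name is a placeholder, and the namespace matches the one used earlier in this article:
kubectl describe pod <aggregator_pod_name> -n bmc-k8s-logs
kubectl get pvc -n bmc-k8s-logs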
The Kubernetes pods do not come up
Issue symptoms
The agent, aggregator, or operator pods do not come up.
Resolution
Perform the following actions:
- Verify that the image pull secret values are correct.
- If you have a token, verify that the secret value is created by using a valid token.
- Verify that the secret name in the values.yaml file is valid (see the example after this list).
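For example, assuming the secret exists in the same namespace used earlier in this article (adjust the names to your deployment), you can confirm that the image pull secret exists and was created as a Docker registry secret:
kubectl get secret <image_pull_secret_name> -n bmc-k8s-logs -o yaml
# If the secret is missing, re-create it from a valid token or password
kubectl create secret docker-registry <image_pull_secret_name> --docker-server=<registry_url> --docker-username=<user> --docker-password=<token> -n bmc-k8s-logs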
The Helm installation fails
Issue symptoms
The Helm installation fails. This issue might occur because old and invalid Custom Resource Definitions (CRDs) exist in the controller.
Resolution
Perform the following actions:
- Run the following command to check if old CRDs exist in the controller:
kubectl get crd | grep fluent
- If old CRDs exist, delete only the Fluent-specific CRDs (see the example after this list).
- Install Helm again.
For information on installing Helm, see Collecting-Kubernetes-logs.
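For the CRD deletion step, each Fluent-specific CRD returned by the grep command can be removed by name; the CRD name below is a placeholder:
kubectl delete crd <fluent_crd_name>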
Bad Request: Mandatory parameters missing error is displayed when you click Create & Download
Issue symptoms
You get the following error when you click the Create & Download button:
Something went wrong
This issue might occur when there is an error in saving the integration.
Issue scope
This issue might affect all integrations.
Resolution
Check the pod logs for the tdc-controller-service service and search for any exceptions.
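For example, assuming the tdc-controller-service pods run in a namespace that you can access (the pod name and namespace are placeholders), you can inspect the logs as follows:
kubectl get pods --all-namespaces | grep tdc-controller-service
kubectl logs <tdc-controller-service_pod_name> -n <namespace> | grep -i exception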
An error occurs when you run the command to create the bmc-logging namespace and other configurations
Issue symptoms
You get an error after running one of the following commands:
kubectl apply -f <downloaded_configuration_yaml_file>
kubectl create -f <downloaded_configuration_yaml_file>
This issue might occur if you run the command from the wrong directory or if you do not have the appropriate permissions to run the command.
Issue scope
This issue might affect all integrations.
Resolution
Confirm the following conditions:
- The downloaded file is present in the directory from where you are running the command.
- The command is run from the controller and not from a node.
- You have the appropriate privileges to create Kubernetes entities (see the example after this list).
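For example, you can check whether your current user is allowed to create the required Kubernetes entities; the resource types below are illustrative:
kubectl auth can-i create namespaces
kubectl auth can-i create daemonsets -n bmc-logging
kubectl auth can-i create configmaps -n bmc-logging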
An invalid image error occurs after you apply the .yaml file configurations
Issue symptoms
The connector is not running.
You get the Invalid image error after running the following command:
kubectl apply -f <downloaded_configuration_yaml_file>
Issue scope
This issue might affect all integrations.
Resolution
Confirm the following conditions:
- A valid image registry URL value is present at the containers : env : image path (see the sketch after this list).
- The PARAMETERS.docker_container_registry_path string is not present in the daemonset definition in the .yaml file.
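As a rough sketch, the image value in the daemonset definition (at the containers : env : image path described above) should be a fully qualified registry path rather than the unresolved placeholder; the registry URL, image name, and tag below are placeholders:
image: <docker_registry_url>/<connector_image>:<tag>
# Not: image: PARAMETERS.docker_container_registry_path/<connector_image>:<tag>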
An error occurs after you apply the .yaml file configurations
Issue symptoms
The connector is not running.
You get the Unable to pull image error after running the following command:
kubectl apply -f <downloaded_configuration_yaml_file>
Issue scope
This issue might affect all integrations.
Resolution
- If you did not provide a value in the Docker Registry Path field, replace the PARAMETERS.docker_container_registry_path string at the containers : env : image path with the Docker registry URL.
- Ensure that a valid image can be pulled by using the connector image repository URL and that the URL is entered in the .yaml file at the containers : env : image path (see the example after this list).
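For example, assuming Docker is available on the host where you run the check (the image path is a placeholder), you can verify that the image can be pulled:
docker pull <docker_registry_url>/<connector_image>:<tag>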
All pods are in the error status
Issue symptoms
No logs are collected, and the integration status is Configured.
All pods show an error status after you run the following command:
kubectl get pod -n bmc-logging
Issue scope
This issue might affect all integrations with incorrect configurations.
Resolution
- Validate the ConfigMap (name: bmc-config-map) definition in the .yaml file. Make sure that it is valid YAML (see the example after this list).
- Validate that the value in the ConfigMap definition follows the Fluentd configuration syntax.
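For example, you can validate the downloaded file without applying it and then inspect the deployed ConfigMap; the file name is a placeholder:
kubectl apply --dry-run=client -f <downloaded_configuration_yaml_file>
kubectl get configmap bmc-config-map -n bmc-logging -o yaml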
Error or CrashLoopBackOff status for some pods
Issue symptoms
Not all logs are collected.
Multiple pods show the Error or CrashLoopBackOff status after you run the following command:
kubectl get pod -n bmc-logging
Issue scope
This issue might affect all integrations configured for a cluster.
Resolution
- Run the following command:
kubectl get pod -n bmc-logging -o wide
- Find the nodes to which the Error/CrashLoopBackOff pods are assigned.
- Check the health of those nodes (see the example after this list).
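For example, you can check the health and resource pressure of an affected node; the node name is a placeholder:
kubectl describe node <node_name>
kubectl get node <node_name> -o wide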
Collected logs are unavailable in Explorer
Issue symptoms
No logs are collected or displayed in Explorer.
Issue scope
This issue might affect all integrations.
Resolution
- Find the nodes to which the service pods are assigned by running the following command:
kubectl get pod -o wide | grep <service name>
- For each node, get the connector pod by running the following command:
kubectl get pod -n bmc-logging -o wide
- Note the connector pods running on the nodes where the service pods are also running.
- For each of these connector pods, run the following command:
kubectl exec -it <daemonset pod name> -n bmc-logging -- bash
- Go to the /fluentd/log path, and run the following command:
cat fluent.log
- In the log file, search for the string "following tail of /var/log/containers/*<name of the service pod>". If the string is not present, the connector cannot find the logs of the service on the node.
- Ensure that the service pod logs are stored on the node (see the example after this list).
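For example, assuming you can open a shell on the node (or inside the connector pod, where the host log directory is typically mounted), you can check whether the container log files for the service exist; the pod name is a placeholder:
ls /var/log/containers/ | grep <name of the service pod>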