Installing Health Check Tool
Use the Health Check Tool (HCT) to verify whether your environment meets the configuration and tuning recommendations. By running this tool at different stages of your BMC Helix IT Operations Management installation, you can identify potential issues early and address them before they affect your installation.
Before you begin
Make sure that Java 17 is installed on your system.
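One way to confirm this prerequisite is to parse the output of java -version. The following is a minimal sketch and not part of the Health Check Tool: the helper name java_major_version is hypothetical, and it assumes the usual version "..." string on the first line of the output.

```shell
# Extract the major version from the first line of `java -version` output.
# Handles the modern scheme ("17.0.9") and the legacy one ("1.8.0_392").
java_major_version() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;;
    *)   printf '%s\n' "${ver%%[.-]*}" ;;
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it before parsing.
  major=$(java_major_version "$(java -version 2>&1 | head -n 1)")
  if [ "$major" = "17" ]; then
    echo "Java 17 found"
  else
    echo "Found Java major version '${major}', expected 17" >&2
  fi
else
  echo "java is not on PATH" >&2
fi
```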
Installing the Health Check Tool
Download the BMC Helix Platform Health Check Tool Version file from BMC Helix EPD.
Extract the file to a temporary directory on the controller node where you will deploy BMC Helix IT Operations Management.
Example: /tmp/hct
Open the sample-env-setting.sh file and update the following environment variables. Make sure the values point to the correct locations on your system:
JAVA_HOME
Specify the path name to the installed Java runtime.
To determine the correct path name:
- To locate the Java binary, run the which java command at the command prompt.
- To get the full path name, run readlink -f $(which java)
Copy the path name up to (but not including) the /bin directory.
Example
If the Java path name is /usr/lib/jvm/java-17-openjdk-amd64/bin/java, then set:
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
INSTALL_DIR
Specify the path name to the directory where the BMC Helix on-prem deployment manager files are extracted.
Example
export INSTALL_DIR=/root/helix-on-prem-deployment-manager
HCT_HOME
To run commands from a location outside the extracted folder, specify the path name to the Health Check Tool directory.
Example
export HCT_HOME=/tmp/hct
(Optional) SLEEP_DURATION_SECONDS
Specify the delay in seconds between readiness checks during specific pre-install validations, such as the NFS daemonset pod readiness loop or NS lookup daemonset pod.
Increase this value if image pulls are slow or if readiness checks require additional time.
Example
export SLEEP_DURATION_SECONDS=20
(Optional) INGRESS_NAMESPACE
Specify the namespace where your Ingress controller is deployed. Set this variable only if your Ingress controller is not running in the default ingress-nginx namespace.
Example
export INGRESS_NAMESPACE=<your-ingress-namespace>
(Optional) SKIP_DAEMONSET_CLEAN
Set this variable to true to prevent the deletion of the daemonset pods created during pre-install checks.
Example
export SKIP_DAEMONSET_CLEAN=true
(Optional) SKIP_EMAIL_TEST
Set this variable to true to skip the SMTP/email validation step during pre-install checks.
Example
export SKIP_EMAIL_TEST=true
After updating the environment variables, run the following command to export these variables into your environment:
source sample-env-setting.sh
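After sourcing the file, a quick sanity check can catch a mistyped path before you run any checks. This is a minimal sketch: check_hct_env and check_dir_var are hypothetical helper names, and treating HCT_HOME as required is an assumption.

```shell
# check_dir_var NAME VALUE: fail if VALUE is empty or not an existing directory.
check_dir_var() {
  if [ -z "$2" ] || [ ! -d "$2" ]; then
    echo "$1 is unset or not a directory: '$2'" >&2
    return 1
  fi
}

# check_hct_env: return 0 if the required variables look usable, 1 otherwise.
check_hct_env() {
  status=0
  # JAVA_HOME must contain an executable bin/java.
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "${JAVA_HOME:-}/bin/java" ]; then
    echo "JAVA_HOME is unset or has no executable bin/java: '${JAVA_HOME:-}'" >&2
    status=1
  fi
  check_dir_var INSTALL_DIR "${INSTALL_DIR:-}" || status=1
  check_dir_var HCT_HOME "${HCT_HOME:-}" || status=1
  return "$status"
}
```

Running check_hct_env && echo "environment looks complete" after the source command prints the confirmation only when all three paths resolve.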
Running the Health Check Tool
Run the Health Check Tool to validate your environment at different stages of the BMC Helix IT Operations Management deployment.
You can perform the following checks by using the Health Check Tool:
- Cluster health check
- Pre-installation check
- Installation-time check
- Post-installation check
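The four checks correspond to the clustercheck, preinstall, install, and postinstall arguments of helix-healthcheck.sh, as the procedures in this topic show. If you prefer to run the tool from any directory, a small wrapper could look like the following sketch. The function name run_hct_stage is hypothetical, and the sketch assumes that helix-healthcheck.sh sits at the top level of the directory that HCT_HOME points to.

```shell
# run_hct_stage STAGE: validate the stage name, then run the tool from HCT_HOME.
run_hct_stage() {
  case "$1" in
    clustercheck|preinstall|install|postinstall) ;;
    *)
      echo "unknown stage '$1' (expected clustercheck, preinstall, install, or postinstall)" >&2
      return 2
      ;;
  esac
  # Assumption: helix-healthcheck.sh is at the top level of $HCT_HOME.
  # The subshell keeps the caller's working directory unchanged.
  ( cd "${HCT_HOME:-.}" && ./helix-healthcheck.sh "$1" )
}
```

For example, run_hct_stage preinstall runs the pre-installation check, and a mistyped stage name is rejected before the tool starts.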
To run the cluster health check
Performing a cluster health check helps you validate the readiness and current health of the cluster. Run the following command at the command prompt:
./helix-healthcheck.sh clustercheck
The cluster health check performs the following validations:
- Monitors CPU and memory usage by using configurable thresholds, with the default threshold set to 90%.
- Verifies that all the worker nodes are in the Ready state.
- Validates node resources by comparing the pod resource requests with the available node capacity. This validation requires the Kubernetes Metrics Server to be available on the cluster.
- Runs a CephCluster health assessment for OpenShift storage environments. The cluster health check supports only CephCluster storage validation, which runs in the openshift-storage namespace.
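To see by hand what the usage and node-state validations look at, you can query the cluster with kubectl. The sketch below illustrates this kind of check and is not the tool's own implementation: flag_busy_nodes is a hypothetical helper, it assumes the Metrics Server is installed so that kubectl top nodes returns data, and it treats "above 90%" as the failing condition.

```shell
# flag_busy_nodes THRESHOLD: read `kubectl top nodes --no-headers` output on
# stdin (NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%) and print the nodes whose
# CPU% or MEMORY% is above THRESHOLD.
flag_busy_nodes() {
  awk -v t="$1" '($3 + 0 > t) || ($5 + 0 > t) { print $1, $3, $5 }'
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl top nodes --no-headers | flag_busy_nodes 90
  # Print nodes whose STATUS is not exactly "Ready". Cordoned nodes report
  # "Ready,SchedulingDisabled" and are therefore listed as well.
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1, $2 }'
fi
```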
To run the pre-installation check
Before you begin the BMC Helix IT Operations Management installation, perform a pre-installation check to verify your environment configuration by running the following command:
./helix-healthcheck.sh preinstall
This check validates the following critical components:
- Availability of kubectl or oc
- Required binaries and variables
- Hardware and storage setup
- Certificate configurations
- Load balancer and DNS connectivity
- SMTP server setup
- Node health and time synchronization
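The first item in this list, the availability of kubectl or oc, is easy to confirm yourself before you start the tool. A minimal sketch follows; the helper name find_k8s_cli and the choice to prefer kubectl over oc are assumptions, not documented tool behavior.

```shell
# find_k8s_cli: print the first Kubernetes CLI found on PATH
# (kubectl first, then oc). Returns 1 if neither is available.
find_k8s_cli() {
  for cli in kubectl oc; do
    if command -v "$cli" >/dev/null 2>&1; then
      echo "$cli"
      return 0
    fi
  done
  return 1
}

if k8s_cli=$(find_k8s_cli); then
  echo "Using $k8s_cli"
else
  echo "Neither kubectl nor oc was found on PATH" >&2
fi
```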
When you upgrade BMC Helix IT Operations Management, this check also performs the following validations:
- Verifies the reachability of Kubernetes core services by running DNS resolution tests from pods on each node. The tests use pods deployed in the ingress-nginx namespace.
- Validates pod status and identifies pods that are not in the Running or Completed state.
- Monitors container resource limits and generates alerts when containers approach their CPU or memory limits.
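The pod status validation can be approximated with kubectl when you want a quick look outside the tool. The following sketch is illustrative only: unhealthy_pods is a hypothetical helper, and it relies on the column order of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, STATUS, and so on).

```shell
# unhealthy_pods: read `kubectl get pods --all-namespaces --no-headers` output
# on stdin and print namespace/name and status for every pod whose STATUS is
# neither Running nor Completed.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods --all-namespaces --no-headers | unhealthy_pods
fi
```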
To run the installation-time check
If the installation process becomes unresponsive or fails, perform an installation-time check to identify potential issues by running the following command:
./helix-healthcheck.sh install
The installation-time check analyzes:
- Pod logs
- RSSO URL connectivity
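A manual connectivity test of the same kind can be done with curl. This is a sketch under stated assumptions: url_reachable is a hypothetical helper, RSSO_URL is a placeholder variable that you set to your own RSSO endpoint, and a response with an HTTP status below 400 within 10 seconds is taken to mean "reachable".

```shell
# url_reachable URL: succeed if the URL answers with an HTTP status below 400
# within 10 seconds. Redirects are followed.
url_reachable() {
  curl --silent --show-error --fail --location --max-time 10 \
       --output /dev/null "$1"
}

# RSSO_URL is a placeholder; nothing is tested unless you set it.
if [ -n "${RSSO_URL:-}" ]; then
  if url_reachable "$RSSO_URL"; then
    echo "RSSO URL is reachable"
  else
    echo "RSSO URL is not reachable: $RSSO_URL" >&2
  fi
fi
```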
To run the post-installation check
After completing the BMC Helix IT Operations Management installation, perform a post-installation check to verify the health of your system by running the following command:
./helix-healthcheck.sh postinstall
The post-installation check analyzes:
- Pod logs
- RSSO URL connectivity
In addition, it performs an in-depth validation of your infrastructure stack, which includes the following checks:
- Validates environment variables for each infrastructure component.
- Verifies the correct resource limits and security context configurations.
- Supports optional component deployments.
- Detects deployed infrastructure components.
- Validates infrastructure pod counts and overall pod health.
- Performs connection testing and functional checks for the following components:
  - PostgreSQL
  - OpenSearch
  - Kafka (including replication lag monitoring)
  - Zookeeper
  - Valkey
  - MinIO / S3 bucket accessibility
  - Victoria Metrics
  - Redis cluster (optional)
Health check reports
The Health Check Tool generates the following reports that help you review the health of the environment and identify potential issues:
- Log files, which are written to the $HCT_HOME/logs folder.
- An HTML summary report, which is written to the $HCT_HOME/reports folder.
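To find the most recent summary report after a run, you can sort the reports folder by modification time. A minimal sketch follows; latest_report is a hypothetical helper, and the .html file extension is an assumption based on the report being described as HTML.

```shell
# latest_report DIR: print the path of the most recently modified .html file
# in DIR, or nothing if there is none.
latest_report() {
  ls -t "$1"/*.html 2>/dev/null | head -n 1
}

# Show the newest report when HCT_HOME is set in the current shell.
if [ -n "${HCT_HOME:-}" ]; then
  latest_report "$HCT_HOME/reports"
fi
```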
Uninstalling the Health Check Tool
To uninstall the Health Check Tool, delete the extracted directory by using the following command:
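As a hedged sketch of that removal, assuming the tool was extracted to /tmp/hct as in the earlier example: the remove_hct helper below is hypothetical and only wraps rm -rf with a guard against an empty path, the root directory, or a directory that does not exist.

```shell
# remove_hct DIR: delete the extracted Health Check Tool directory.
# Refuses to act on an empty argument, on "/", or on a missing directory.
remove_hct() {
  if [ -z "${1:-}" ] || [ "$1" = "/" ] || [ ! -d "$1" ]; then
    echo "refusing to remove '${1:-}'" >&2
    return 1
  fi
  rm -rf -- "$1"
}

# Example, using the extraction path shown earlier in this topic:
# remove_hct /tmp/hct
```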