Monitoring a Kubernetes cluster

Objective

To monitor the availability and performance of the critical areas of your Kubernetes cluster.

User persona

The tasks in this end-to-end use case must be performed by a tenant administrator.

Step 1: Identify components

Identify the Kubernetes cluster to monitor.
Identify a PATROL Agent to monitor the Kubernetes cluster.

Step 2: Create and deploy a package

Create a package with the Kubernetes KM.
On the Administration > Repository > Deployable Packages tab, click Create. Complete the following steps, clicking Next to move from one page to the next:
1. Select the operating system and platform for which you want to create a package. The list of platforms changes according to the operating system that you select.
2. Select the Kubernetes KM.
3. Select all sub-components.
4. Specify the installation directory for the package.
5. Enter the PATROL Agent product directory.
6. Verify the PATROL Agent name, and enter the PATROL Agent account credentials.
7. Enter the PATROL Agent port number, and choose whether to restart the PATROL Agent automatically or manually.
8. Add a name and description for the package, and select the file format.
9. Save the package.
Deploy the package to the PATROL Agent.
On the Configuration > Agents page:
1. Do one of the following:
  - To deploy and install packages on a single device, click the device action menu.
  - To deploy and install packages on multiple devices, select the devices by holding the Shift key, and click the action menu from the column heading.
2. Click Deploy and Install Packages.
To view the current installation status, hover over the icon in the Deploy Status column.
If the package deployment fails validation, an error message is displayed. For troubleshooting information, see Deploying packages.

Step 3: Configure a monitoring policy

On the Configuration > Monitor Policies page, click Create, and do the following:

On the Create Monitoring Policy page, add a unique name and description for the monitoring policy. BMC recommends that you organize policies according to the precedence numbers when creating and editing policies. You can also include policy-specific information in the policy names.
Add the associated user group for the policy.
An associated user group is the user group that the logged-on user belongs to. If the user belongs to multiple user groups, select the appropriate user group for the policy.
If you want to share the policy with the user group that you selected, select the Share with User Group checkbox.
Add a unique precedence number to the policy.
You can add a custom value in this field, or use the arrows to increase or decrease the value.
If you want to enable the policy immediately, select Enable Policy. Alternatively, you can enable it later from the Monitor Policies page.
Create the PATROL Agent selection criteria that determine which Agents the policy applies to. Use the following options:
- To add more than one condition, click the add icon; to remove an existing condition, click the remove icon.
- To group the conditions, use the parentheses and Boolean operators from their corresponding lists.
Add configuration details in the Monitoring tab:
1. Specify the Kubernetes cluster details.
  - Master Node: The host name or IP address of the Kubernetes master node. Run the kubectl cluster-info command on the cluster to obtain the host name or IP address of the master node (API server).
  - Port Number: The port number to connect to the Kubernetes master node. The default port number is 6443. Run the kubectl cluster-info command on the cluster to obtain the port number of the master node.
  - Authentication Type: Certificate-based or token-based authentication type to connect to the Kubernetes cluster.
  - Client Certificate File Path (.pfx): (For certificate-based authentication) The absolute path of the client certificate file on the PATROL Agent server. The client certificate file must be in the .pfx format. Download this script to create the .pfx client certificate.
    Note: The client certificate file must reside on the host where the PATROL Agent is running, and the BMC PATROL default account must have read permission for the client certificate file.
  - Client Certificate Password: (For certificate-based authentication) The password to access the client certificate file.
  - Authentication Token: (For token-based authentication) The bearer token to connect to the Kubernetes cluster. Download this script to create a service account and obtain the bearer token.
2. Configure the Proxy Server.
  - Use Proxy Configuration: Enables a proxy configuration.
  - Server Name: The host name or IP address of the proxy server used to route the HTTP requests.
  - Port: The proxy server port number.
  - User Name: The proxy server username.
  - Password: The password of the specified proxy server username.
3. Configure the Namespace filter.
  - Namespace Filter Type: Include or exclude the Kubernetes cluster namespaces from monitoring.
  - Namespace Filter: The Kubernetes cluster namespaces to include or exclude from monitoring. You can enter the exact Kubernetes cluster namespace names or a regular expression matching multiple namespaces. To add multiple entries, enter a pipe-separated list of namespaces.
    Examples:
    To filter a single namespace MyNamespace, enter the name MyNamespace
    To filter namespaces ProdNamespace and QANamespace, enter the regular expression ProdNamespace|QANamespace
    To filter all namespaces that start with the word Test, enter the regular expression Test.*
  - Enable Containers Monitoring: If enabled, the KM discovers the containers under the pod instances.
4. Configure the administration.
  - JVM Arguments: The additional Java Virtual Machine arguments for the Java collector. For example, enter the following settings for Java memory: -Xms256m -Xmx1024m.
  - Device Mapping:
    - Node - Enables device mapping of nodes. The KM creates the node device by resolving the DNS name of the node from its IP address. If device mapping is disabled for nodes, the nodes are displayed as instances in BMC Helix Operations Management in their respective hierarchy below the PATROL Agent device.
    - Pod - Enables device mapping of pods. The KM creates the device by using the name and the IP address of the pod. If device mapping is disabled for pods, the pods are displayed as instances in BMC Helix Operations Management in their respective hierarchy below the PATROL Agent device.
    - Container - Enables device mapping of containers. The KM creates the container device by concatenating pod and container names. For example: <pod-name>-<container-name>. If device mapping is disabled for containers, the containers are displayed as instances in BMC Helix Operations Management in their respective hierarchy below the PATROL Agent device.
    Note: If you modify this field, restart the PATROL Agent to apply the changes.
5. Specify the path to the JRE directory on the PATROL Agent server.
  For example, if the Java executable on the PATROL Agent server is located at /usr/java/jdk1.8.0_45/jre/bin/java, specify /usr/java/jdk1.8.0_45/jre as the value.
  If the specified path does not exist or if you leave this field blank, the KM searches for the JRE in the following directory order:
  1. <PATROL_HOME>/openjdk
  2. <PATROL_HOME>/jre64
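The Master Node and Port Number values described in the Kubernetes cluster details can be read from the URL that kubectl cluster-info prints. The following is a minimal sketch, not part of the KM: the function name parse_api_server and the sample address are hypothetical, and the sketch assumes that the first URL in the command output is the API server URL and that a URL without an explicit port uses the default port of its scheme.

```python
import re
from urllib.parse import urlsplit

# kubectl may colorize its output; strip ANSI color codes before parsing.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def parse_api_server(cluster_info_text):
    """Return (host, port) taken from the first URL in the given text.

    Assumption: the first URL in `kubectl cluster-info` output is the
    API server (master node) URL.
    """
    plain = ANSI_ESCAPE.sub("", cluster_info_text)
    match = re.search(r"https?://\S+", plain)
    if match is None:
        raise ValueError("no URL found in cluster-info output")
    url = urlsplit(match.group(0))
    # Assumption: with no explicit port, fall back to the scheme default.
    port = url.port or (443 if url.scheme == "https" else 80)
    return url.hostname, port


# Hypothetical sample output; 192.0.2.10 is a documentation-only address.
sample = "Kubernetes control plane is running at https://192.0.2.10:6443\n"
print(parse_api_server(sample))  # ('192.0.2.10', 6443)
```

In practice, you would paste the output of kubectl cluster-info into the sample string, or capture it with subprocess, and copy the resulting host and port into the Master Node and Port Number fields.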
Save the policy.
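Before saving a Namespace Filter value, it can help to check which namespaces a pattern selects. The following sketch reproduces the three Namespace Filter examples from the Monitoring tab settings by using Python regular expressions. It is an illustration only: the function namespace_selected is hypothetical, and the sketch assumes that a pattern must match the whole namespace name, which the KM documentation does not state explicitly.

```python
import re


def namespace_selected(namespace, filter_pattern, filter_type="include"):
    """Return True if the namespace would be monitored under this filter.

    Assumption: the pattern must match the entire namespace name.
    """
    matched = re.fullmatch(filter_pattern, namespace) is not None
    # An include filter keeps matches; an exclude filter keeps non-matches.
    return matched if filter_type == "include" else not matched


namespaces = ["MyNamespace", "ProdNamespace", "QANamespace", "TestAlpha", "Test"]

for pattern in ("MyNamespace", "ProdNamespace|QANamespace", "Test.*"):
    included = [ns for ns in namespaces if namespace_selected(ns, pattern)]
    print(f"{pattern!r} includes {included}")
```

With the filter type set to exclude, the same patterns select the complementary set of namespaces.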

Step 4: Create alerts

On the Configuration > Alarm Policies page, click Create, and do the following:

Specify a unique name and an optional description.
Create the conditions based on which the alarm is generated.
1. Metric and instance details: The metric and the instances for which you want to add this policy. Select either all instances or multiple instances.
  Note that you cannot create multiple policies with duplicate metric information. Policies with duplicate metric information can be added only if you specify different instance types (all and multiple). In this case, the policy set for multiple instances gets precedence.
2. Threshold details and post-trigger actions: The threshold value, the violation duration, and the condition under which the generated alarm is closed.
  Specify whether the event must be closed immediately after it is generated, or after the metric returns to a normal state and a duration equal to the violation time period has elapsed. You can also specify that the event must not be closed. Alarm events that are not closed remain open until they are closed manually, the policy is deleted, or the PATROL Agent associated with the alarm is deleted. To change any of the values, click them.
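The precedence rule for duplicate metric information (a policy set for multiple instances takes precedence over a policy set for all instances) can be pictured with a small model. This is a toy sketch only, not the product's implementation: the AlarmPolicy class, the effective_policy function, and the metric and instance names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlarmPolicy:
    name: str
    metric: str
    # None models "all instances"; a frozenset models "multiple instances".
    instances: Optional[frozenset]


def effective_policy(policies, metric, instance):
    """Pick the policy that applies to one instance of a metric."""
    candidates = [p for p in policies if p.metric == metric]
    # A policy that lists the instance explicitly takes precedence.
    for policy in candidates:
        if policy.instances is not None and instance in policy.instances:
            return policy
    # Otherwise fall back to the all-instances policy, if one exists.
    for policy in candidates:
        if policy.instances is None:
            return policy
    return None


policies = [
    AlarmPolicy("cpu-all", "cpu_usage", None),
    AlarmPolicy("cpu-critical-pods", "cpu_usage", frozenset({"pod-a", "pod-b"})),
]

print(effective_policy(policies, "cpu_usage", "pod-a").name)  # cpu-critical-pods
print(effective_policy(policies, "cpu_usage", "pod-z").name)  # cpu-all
```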
Save the policy.

Step 5: Enrich and notify

On the Configuration > Event Policies page, click Create, and do the following:

Specify a unique name, optional description, and precedence number for the policy.
Create the event selection criteria based on which the policy is applied to the events.
Select the time frame for which the policy must be active.
By default, the policy is set to Always Active.
You can also set it to be active during a specific time.
Select the following policy types and configure them.
- Enrichment: Processes events with refined attribute values to make the events more meaningful.
  Select the required settings and specify the values.
- Notification: If the notification service is Email, the policy notifies users via email that an event has occurred, so that appropriate actions can be taken.
  Select the required settings and specify the values.
Select Enable Policy.
You can enable or disable the policy any time from the Event Policies page.
Save the policy.

Best practice

BMC recommends that you set up groups and dashboards to visualize information better. For more information, see Setting up groups and Viewing data in BMC Helix Dashboards.
