Moviri – K8s (Kubernetes) Prometheus Extractor
“Moviri Integrator for BMC Helix Continuous Optimization – k8s (Kubernetes) Prometheus” is an additional component of the BMC Helix Continuous Optimization product. It extracts data from Kubernetes, a leading cluster management system for cloud-native containerized environments. Relevant capacity metrics are loaded into BMC Helix Continuous Optimization, which provides advanced analytics over the extracted data in the form of an interactive dashboard, the Kubernetes View.
The integration supports the extraction of both performance and configuration data across different components of the Kubernetes system and can be configured via parameters that allow entity filtering and many other settings. Furthermore, the connector is able to replicate relationships and logical dependencies among entities such as clusters, nodes, namespaces, deployments, and pods.
The documentation is targeted at BMC Helix Continuous Optimization administrators, who are in charge of configuring and monitoring the integration between BMC Helix Continuous Optimization and Kubernetes.
Step I. Complete the pre-configuration tasks
Step II. Configure the ETL
A. Configuring the basic properties
Some of the basic properties display default values. You can modify these values if required.
To configure the basic properties:
- In the console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
- On the ETL tasks page, click Add > Add ETL. The Add ETL page displays the configuration properties. You must configure properties in the following tabs: Run configuration, Entity catalog, and Connection.
- On the Run Configuration tab, select Moviri - k8s Prometheus Extractor from the ETL Module list. The name of the ETL is displayed in the ETL task name field. You can edit this field to customize the name.
- Click the Entity catalog tab, and select one of the following options:
  - Shared Entity Catalog: From the Sharing with Entity Catalog list, select the entity catalog name that is shared between ETLs.
  - Private Entity Catalog: Select if this is the only ETL that extracts data from the k8s Prometheus resources.
- Click the Connection tab, and configure the following properties:
| Property Name | Value Type | Required? | Default | Description |
| Prometheus – API URL | String | Yes | Blank | Prometheus API URL (http/https://hostname:port). The port can be omitted. |
| Prometheus – API Version | String | Yes | v1 | Prometheus API version. It should be the same as the Kubernetes API version, if one is in use. |
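| Prometheus – API Authentication Method | String | Yes | No Authentication | Prometheus API authentication method. The methods referenced by the properties in this table are No Authentication, Basic Authentication, Authentication Token, OpenShift OAuth, Generic OAuth, Key Vault, and Client Certificate. |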
| Prometheus – Username | String | No | Blank | Prometheus API username if the Authentication method is set to Basic Authentication. |
| Prometheus – Password | Password | No | Blank | Prometheus API password if the Authentication method is set to Basic Authentication. |
| Prometheus – API Authentication Token | Password | No | Blank | Prometheus API Authentication Token (Bearer Token) if the Authentication method is set to Authentication Token. |
| OAuth - URL | String | No | Blank | URL for the OpenShift OAuth (http/https://hostname:port). Only available if Authentication Method is set to "OpenShift OAuth". The port can be omitted. |
| OAuth - Username | String | No | Blank | OpenShift OAuth username if the Authentication Method is set to OpenShift OAuth. |
| OAuth - Password | String | No | Blank | OpenShift OAuth password if the Authentication Method is set to OpenShift OAuth. |
| Generic OAuth - URL | String | No | Blank | URL for the OAuth (http/https://hostname:port). Only available if Authentication Method is set to "Generic OAuth". The port can be omitted. |
| Generic OAuth - Client ID | String | No | Blank | Generic OAuth Client ID if the Authentication Method is set to Generic OAuth. |
| Generic OAuth - Client Password | Password | No | Blank | Generic OAuth Client password if the Authentication Method is set to Generic OAuth. |
| (Optional) Generic OAuth - Resource Server | String | No | Blank | Generic OAuth Resource Server if the Authentication Method is set to Generic OAuth. |
| MS Azure Key Vault - Host | String | No | Blank | The Azure Key Vault host name when the Authentication Method Key Vault is selected. |
| MS Azure Key Vault - Secret Name | String | No | Blank | The Azure Key Vault secret when the Authentication Method Key Vault is selected. |
| Client Certificate - Keystore path | String | No | Blank | The Java Keystore path located on the Scheduler running the ETL. Only available when the Authentication Method is set to Client Certificate. |
| Client Certificate - Keystore password | Password | No | changeit | The password for the Java Keystore needed when Client Certificate is selected as the Authentication Method. |
| Client Certificate - Truststore path | String | No | Blank | The Java Truststore path located on the Scheduler running the ETL. Only available when the Authentication Method is set to Client Certificate. |
| Client Certificate - Truststore password | Password | No | changeit | The password for the Java Truststore needed when Client Certificate is selected as the Authentication Method. |
| Prometheus – Use Proxy Server | Boolean | No | Blank | Whether to use a proxy server. Applies when the authentication method is either Basic Authentication or None. The proxy server supports HTTP, and supports only Basic Authentication or no authentication. |
| Prometheus - Proxy Server Host | String | No | Blank | Proxy server host name. |
| Prometheus - Proxy Server Port | Number | No | Blank | Proxy server port. Default 8080. |
| Prometheus - Proxy Username | String | No | Blank | Proxy server username |
| Prometheus - Proxy Password | String | No | Blank | Proxy server password |
| Use Kubernetes API | Boolean | Yes | Blank | Whether to use the Kubernetes API. |
| Kubernetes - API Host | String | Yes | Blank | Kubernetes API server host name. For OpenShift, use the OpenShift console FQDN (e.g., console.ose.bmc.com). |
| Kubernetes - API Port | Number | Yes | Blank | Kubernetes API server port. For OpenShift, use the same port as the console (typically 8443). |
| Kubernetes - API Protocol | String | Yes | HTTPS | Kubernetes API protocol, "HTTPS" in most cases |
| Kubernetes - API Authentication Token | Password | No | Blank | Token of the integrator service account (see data source configuration section). If the Authentication Method is set to OpenShift OAuth, then you do not need to put a token in this field. Please make sure the User account has the right permissions specified in the data source configuration section for Kubernetes API access. |
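As a rough illustration of how the Prometheus connection properties above fit together, the following Python sketch builds an instant-query URL and the HTTP headers for three of the authentication methods. The `/api/<version>/query` path is the standard Prometheus HTTP API endpoint; the function names, the method strings, and the example host are hypothetical, and the connector's actual request logic may differ.

```python
import base64
from urllib.parse import urlencode


def build_query_url(api_url: str, api_version: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    base = api_url.rstrip("/")
    return f"{base}/api/{api_version}/query?{urlencode({'query': promql})}"


def build_auth_headers(method: str, username: str = "", password: str = "",
                       token: str = "") -> dict:
    """Return HTTP headers for a subset of the authentication methods.

    The method strings mirror the option names used in the table; how the
    connector spells them internally is an assumption.
    """
    if method == "No Authentication":
        return {}
    if method == "Basic Authentication":
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if method == "Authentication Token":
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"method not covered by this sketch: {method}")


# Hypothetical host, used only for illustration.
url = build_query_url("https://prometheus.example.com:9090/", "v1", "up")
headers = build_auth_headers("Authentication Token", token="abc")
```

Sending a request built this way from the ETL Engine host, before saving the ETL, is one possible way to confirm that the URL, version, and credentials are accepted by Prometheus.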
- Click the Kubernetes Extraction tab, and configure the following properties:
| Property Name | Value Type | Required? | Default | Description |
| Data Resolution | String | Yes | 1 Hour | Data resolution for data to be pulled from Prometheus into BHCO. |
| Cluster Name | String | No | Blank | If Kubernetes API is not in use, cluster name must be specified. |
| Default Last Counter | String | Yes | | Default earliest time, in UTC, from which the connector pulls data. Format as YYYY-MM-DDTHH24:MI:SS.SSSZ, for example, 2019-01-01T19:00:00.000Z. |
| Lag Hour to the Current Time | String | No | | Lag, in hours, relative to the current time. |
| Extract PODWorkload | Boolean | Yes | No | Whether to import Pod Workload entities. |
| Filter PodWorkload using namespace | Boolean | No | No | If extracting PodWorkload, choose to filter pod workloads using namespaces. This uses a semi-colon separated list, and only uses exact matching. |
| Filter PodWorkloads using namespace allowlist/denylist | String | No | denylist | If filtering PodWorkload, choose to filter using a denylist or allowlist. |
| Import only PODWorkloads not in given namespaces (semi-colon separated list) | String | No | | If filtering using denylist, don't import data for pod workloads in given namespace denylist. If filtering using allowlist, only import data for pod workloads in given namespace allowlist. All other data for given namespaces will still be imported. When empty, don't filter pod workloads. |
| Extract Controller Metrics | Boolean | Yes | Yes | Whether to import Controller (Deployment, DaemonSet, ReplicaSet, and so on) metrics. If neither POD Workload nor Controller is selected, the ETL does not create any Controller metrics. If Controllers are not selected and POD Workloads are selected for import, the ETL creates empty Controller entities. |
| Maximum Hours to Extract | String | No | 120 | Maximum number of hours of data the connector can pull. If left empty, the default of 5 days from the default last counter is used. Note that if no data is found on Prometheus, the integration moves the last counter forward by the maximum extraction period, starting from the last counter. |
| Import by image metric on all entities? | String | No | No | Whether to import BYIMAGE metrics on clusters, namespaces, controllers, and nodes. |
| Import podworkload metric highmark counters? | String | No | No | Whether to import Highmark metrics for CPU_USED_NUM and MEM_USED on pod workloads for by-container metrics. |
| Import annotations | Boolean | No | No | Whether to import selected Kubernetes Annotations as labels. Requires that the Kubernetes API is used. |
| Annotations to import (semi-colon separated list) | String | No | | Allowlist of Kubernetes Annotations to import as labels. |
| Import only the following tags (semi-colon separated list) | String | No | | Import only the listed tag types (Kubernetes Label keys). Specify the keys of the labels, either in the original format as it appears in the Kubernetes API or using underscore (_) as the delimiter. For example, node_role_kubernetes_io_compute and node-role.kubernetes.io/compute are equivalent and will be imported as node_role_kubernetes_io_compute. |
| Enable Recovery Mode | Boolean | No | No | Only import data for entities between starting and ending timestamps. Object relationship data for this time period will not be collected. |
| Start timestamp of recovery mode | String | Yes | | Starting timestamp of extraction when recovery mode is enabled. |
| End timestamp of recovery mode | String | Yes | | Ending timestamp of extraction when recovery mode is enabled. |
| Partition Size (Default: 400, Max: 999) | String | No | 400 | Size of the partition used for aggregating controller data. |
| Create UID configuration metrics | Boolean | No | No | Choose if configuration metrics containing UID value of entities will be created, if Kube API is also enabled. |
| Extract additional max statistics for CPU/MEMORY QUOTA metrics for LIMITS/REQUESTS | Boolean | No | No | Choose to import data for CPU_QUOTA_LIMIT, CPU_QUOTA_REQUEST, MEM_QUOTA_LIMIT, MEM_QUOTA_REQUEST using max aggregation as a second statistic. |
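Two of the properties above, the tag key format and the namespace allowlist/denylist, can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: the rule that every character other than a letter, digit, or underscore becomes an underscore is generalized from the single example in the table, and the function names are hypothetical rather than part of the connector.

```python
import re


def normalize_label_key(key: str) -> str:
    """Replace every character that is not a letter, digit, or "_" with "_"."""
    return re.sub(r"[^0-9A-Za-z_]", "_", key)


def parse_list(value: str) -> list:
    """Split a semicolon-separated property value, dropping empty items."""
    return [item.strip() for item in value.split(";") if item.strip()]


def keep_pod_workload(namespace: str, namespaces: str,
                      mode: str = "denylist") -> bool:
    """Decide whether a pod workload in `namespace` is imported.

    Matching is exact, as stated in the table; an empty list disables
    filtering.
    """
    listed = parse_list(namespaces)
    if not listed:
        return True
    if mode == "allowlist":
        return namespace in listed
    return namespace not in listed


# Both spellings of the label key map to the same imported tag name.
tag = normalize_label_key("node-role.kubernetes.io/compute")
```

With a denylist of `kube-system;monitoring`, a workload in `kube-system` is skipped while one in `default` is kept; switching the mode to `allowlist` inverts that outcome.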
The following image shows a Run Configuration example for the “Moviri Integrator for BMC Helix Continuous Optimization – k8s Prometheus”:
- Click on the Import Filters tab, and configure the following property, if needed:
| Property Name | Value Type | Required? | Default | Description |
| Path to filtering configuration file | String | No | | Specify the path on the Remote ETL Engine where the file containing the filtering configuration is located. An example of how the filtering file should be formatted can be found here. |
- (Optional) Override the default values of the properties:
Run Configuration
Object Relationship
ETL Task Properties
(Optional) B. Configuring the advanced properties
You can configure the advanced properties to change the way the ETL works or to collect additional metrics.
To configure the advanced properties:
- On the Add ETL page, click Advanced.
- Configure the following properties:
- Click Save.
The ETL tasks page shows the details of the newly configured Prometheus ETL:
Step III. Run the ETL
After you configure the ETL, you can run it to collect data. You can run the ETL in the following modes:
A. Simulation mode: Only validates connection to the data source, does not collect data. Use this mode when you want to run the ETL for the first time or after you make any changes to the ETL configuration.
B. Production mode: Collects data from the data source.
A. Running the ETL in simulation mode
To run the ETL in the simulation mode:
- In the console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
- On the ETL tasks page, click the ETL. The ETL details are displayed.
- In the Run configurations table, click the pencil icon to modify the ETL configuration settings.
- On the Run configuration tab, ensure that the Execute in simulation mode option is set to Yes, and click Save.
- Click Run active configuration. A confirmation message about the ETL run job submission is displayed.
- On the ETL tasks page, check the ETL run status in the Last exit column.
OK indicates that the ETL ran without any error. You are ready to run the ETL in the production mode.
If the ETL run status is Warning, Error, or Failed:
- On the ETL tasks page, click the pencil icon in the last column of the ETL name row.
- Check the log and reconfigure the ETL if required.
- Run the ETL again.
- Repeat these steps until the ETL run status changes to OK.
B. Running the ETL in the production mode
You can run the ETL manually when required or schedule it to run at a specified time.
Running the ETL manually
- On the ETL tasks page, click the ETL. The ETL details are displayed.
- In the Run configurations table, click the pencil icon to modify the ETL configuration settings. The Edit run configuration page is displayed.
- On the Run configuration tab, select No for the Execute in simulation mode option, and click Save.
- To run the ETL immediately, click Run active configuration. A confirmation message about the ETL run job submission is displayed.
When the ETL is run, it collects data from the source and transfers it to the database.
Scheduling the ETL run
By default, the ETL is scheduled to run daily. You can customize this schedule by changing the frequency and period of running the ETL.
To configure the ETL run schedule:
- On the ETL tasks page, click the ETL, and click Edit Task. The ETL details are displayed.
On the Edit task page, do the following, and click Save:
- Specify a unique name and description for the ETL task.
- In the Maximum execution time before warning field, specify the duration for which the ETL must run before generating warnings or alerts, if any.
- Select a predefined or custom frequency for starting the ETL run. The default selection is Predefined.
- Select the task group and the scheduler to which you want to assign the ETL task.
- Click Schedule. A message confirming the scheduling job submission is displayed.
When the ETL runs as scheduled, it collects data from the source and transfers it to the database.
Step IV. Verify data collection
Verify that the ETL ran successfully and check whether the k8s Prometheus data is refreshed in the Workspace.
To verify whether the ETL ran successfully:
- In the console, click Administration > ETL & System Tasks > ETL tasks.
- In the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.
To verify that the k8s Prometheus data is refreshed:
- In the console, click Workspace.
- Expand (Domain name) > Systems > k8s Prometheus > Instances.
- In the left pane, verify that the hierarchy displays the new and updated Prometheus instances.
- Click a k8s Prometheus entity, and click the Metrics tab in the right pane.
- Check if the Last Activity column in the Configuration metrics and Performance metrics tables displays the current date.
K8s Prometheus Workspace Entities
| TSCO Entities | Prometheus Entity |
| Kubernetes Cluster | Cluster |
| Kubernetes Namespace | Namespace |
| Kubernetes Node | Node |
| Kubernetes Pod Workload | An aggregated group of pods running on the same controller; stand-alone static pods |
| Kubernetes Controller | DaemonSet, ReplicaSet, StatefulSet, ReplicationController |
| Kubernetes Persistent Volume | Persistent Volume, Persistent Volume Claim |
Entity Relationship
| TSCO Entities | Relationship Type | Description |
| Kubernetes Cluster | ROOTAPP | ROOTAPP |
| Kubernetes Namespace | KC_CONTAINS_KNS | Kubernetes Cluster contains Kubernetes Namespace |
| Kubernetes Node | KC_CONTAINS_KN | Kubernetes Cluster contains Kubernetes Node |
| Kubernetes Pod Workload | KNS_CONTAINS_PODWK | Kubernetes Namespace contains Pod Workload |
| Kubernetes Controller | KNS_CONTAINS_KD | Kubernetes Namespace contains Kubernetes Deployment |
| Kubernetes Persistent Volume | KC_CONTAINS_KPV | Kubernetes Cluster contains Kubernetes Persistent Volume |
Hierarchy
The connector is able to replicate relationships and logical dependencies among these entities as they are found configured within the Kubernetes cluster.
In particular, the following structure is applied:
- a Kubernetes Cluster is attached to the root of the hierarchy
- each Kubernetes Cluster contains its own Nodes, Namespaces and Persistent Volumes
- each Kubernetes Namespace contains its own Controllers and standalone pod workloads
- each Kubernetes Namespace contains persistent volumes via the persistent volume claim relationship
- each Kubernetes Controller contains its pod workloads; a pod workload is an aggregated entity that contains a group of pods running on the same controller
Hierarchy showing Cluster > Namespace > Deployments > Pod Workload
Hierarchy Showing Cluster > PVs & Nodes
Lookup Field Considerations
| Entity Type | Strong Lookup Field | Others |
| Kubernetes - Cluster | KUBE_CLUSTER&&KUBE_TYPE | |
| Kubernetes - Namespace | KUBE_CLUSTER&&KUBE_TYPE&&KUBE_NS_NAME | |
| Kubernetes - Node | KUBE_CLUSTER&&KUBE_TYPE&&HOSTNAME&&NAME | _COMPATIBILITY_ |
| Kubernetes - Controller | KUBE_CLUSTER&&KUBE_TYPE&&KUBE_NS_NAME&&KUBE_DP_NAME | |
| Kubernetes - Pod Workload | KUBE_CLUSTER&&KUBE_NS_NAME&&KUBE_DP_NAME&&KUBE_DP_TYPE&&KUBE_WL_NAME&&KUBE_TYPE | |
| Kubernetes - Persistent Volume | KUBE_CLUSTER&&KUBE_TYPE&&KUBE_PV_NAME | |
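To make the notation in the table above concrete, the sketch below treats a strong lookup definition as a list of field names separated by `&&` and joins the corresponding values of an entity into one composite key. This is an illustration only: how the product stores and compares lookup values internally is not described here, and the function name and the sample field values (such as `NAMESPACE` for KUBE_TYPE) are hypothetical.

```python
def lookup_key(field_spec: str, entity: dict) -> str:
    """Join the values of the fields named in a "&&"-separated spec."""
    fields = field_spec.split("&&")
    missing = [name for name in fields if name not in entity]
    if missing:
        # Every field in a strong lookup must be present to identify the entity.
        raise KeyError(f"missing lookup fields: {missing}")
    return "&&".join(str(entity[name]) for name in fields)


# Hypothetical namespace entity; the values are made up for illustration.
namespace_entity = {
    "KUBE_CLUSTER": "prod",
    "KUBE_TYPE": "NAMESPACE",
    "KUBE_NS_NAME": "default",
}
key = lookup_key("KUBE_CLUSTER&&KUBE_TYPE&&KUBE_NS_NAME", namespace_entity)
```

Two entities that produce the same composite key are treated as the same entity in this sketch, which is why the cluster name appears in every definition in the table.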
Tag Mapping (Optional)
Here is an example of what a tag looks like:
k8s Heapster to k8s Prometheus Migration
The “Moviri Integrator for BMC Helix Continuous Optimization – k8s Prometheus” supports a seamless transition from entities and metrics imported by the “Moviri Integrator for BMC Helix Continuous Optimization – k8s Heapster”. Follow these steps to migrate between the two integrators:
- Stop “Moviri Integrator for BMC Helix Continuous Optimization – k8s Heapster” ETL task.
- Install and configure the “Moviri Integrator for BMC Helix Continuous Optimization – k8s Prometheus”, ensuring that the lookup is shared with the “Moviri Integrator for BMC Helix Continuous Optimization – k8s Heapster” ETL task.
- Start “Moviri Integrator for BMC Helix Continuous Optimization – k8s Prometheus” ETL task.
Pod Optimization - Pod Workloads replace Pods
The “Moviri Integrator for BMC Helix Continuous Optimization – k8s Prometheus” introduces a new entity, the pod workload, starting from v20.02.01. A pod workload is an aggregated entity that groups the pods running on the same controller. It is the direct child of that controller and uses the same name as the parent controller. At the same time, individual pod entities are dropped.
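The aggregation idea can be sketched in Python as follows. The sketch groups pods by namespace and controller and sums one metric per group; the dictionary keys (`namespace`, `controller`, `cpu_used`) and the choice of summation are assumptions made for illustration, not the connector's documented aggregation rules.

```python
from collections import defaultdict


def aggregate_pod_workloads(pods: list) -> dict:
    """Group pods by (namespace, controller) and sum their CPU usage.

    The resulting workload is keyed by the controller name, mirroring the
    rule that a pod workload uses the name of its parent controller.
    """
    workloads = defaultdict(lambda: {"pod_count": 0, "cpu_used": 0.0})
    for pod in pods:
        key = (pod["namespace"], pod["controller"])
        workloads[key]["pod_count"] += 1
        workloads[key]["cpu_used"] += pod["cpu_used"]
    return dict(workloads)


# Hypothetical pods: two replicas of "web" and one of "db" in namespace "shop".
pods = [
    {"namespace": "shop", "controller": "web", "cpu_used": 0.5},
    {"namespace": "shop", "controller": "web", "cpu_used": 0.25},
    {"namespace": "shop", "controller": "db", "cpu_used": 1.0},
]
workloads = aggregate_pod_workloads(pods)
```

In this example the three pods collapse into two pod workloads, one per controller, which is the reduction in entity count that the pod workload entity is meant to provide.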
Common Issues
| Error Messages / Behaviors | Cause | Solution |
| Query errors: HTTP 422 Unprocessable Entity | These errors appear intermittently, and their number can vary widely from run to run. They are usually caused by Prometheus rebuilding or restarting. For a couple of days after Prometheus is rebuilt or reloaded, the errors may keep appearing. | The errors usually go away on their own as Prometheus becomes more stable. |
| Prometheus is running fine but no data is pulled | This is usually caused by the last counter being set too far from today's date. Prometheus has a data retention period, which defaults to 15 days and can be configured. If the ETL is set to extract data older than the retention period, no data is returned. The Prometheus status page shows the retention value in the "storage retention" field. | Modify the default last counter to a more recent date. |
| 504 Gateway Timeouts | These 504 timeout query errors (the server did not respond in time) are related to the route timeout used on OpenShift. The timeout can be configured on a route-by-route basis. For example, the timeout of the Prometheus route can be increased to the 2-minute timeout that is also configured on the Prometheus backend. See the following link to check the configured timeout and how to increase it: https://docs.openshift.com/container-platform/4.6/networking/routes/route-configuration.html | Increase the timeout period on the OpenShift side. |
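For the retention issue described above, a quick check is to compare the configured last counter with the Prometheus retention period. The Python sketch below parses a last counter in the format used by the Default Last Counter property and tests it against a retention period, assuming the 15-day default; the function names are hypothetical, and the actual retention value should be read from the Prometheus status page.

```python
from datetime import datetime, timedelta, timezone


def parse_last_counter(value: str) -> datetime:
    """Parse a timestamp such as 2019-01-01T19:00:00.000Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def within_retention(last_counter: str, now: datetime,
                     retention_days: int = 15) -> bool:
    """Return True if the last counter is not older than the retention period."""
    age = now - parse_last_counter(last_counter)
    return age <= timedelta(days=retention_days)


# Fixed "now" so the example is reproducible.
now = datetime(2019, 1, 10, 0, 0, tzinfo=timezone.utc)
recent_enough = within_retention("2019-01-01T19:00:00.000Z", now)
too_old = not within_retention("2018-12-01T00:00:00.000Z", now)
```

If the check fails, moving the default last counter to a date inside the retention window, as the table suggests, is the expected remedy.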
Data Verification
The following sections provide some indications on how to verify on Prometheus that all the prerequisites are in place before starting to collect data.
Verify Prometheus Build information
Verify the Prometheus "Last successful configuration reload" (from Prometheus UI, check "Status > Runtime & Build Information")
If the "Last successful configuration reload" is less than 3 days old, ask the customer to evaluate the status of the integration over the next 2 to 3 days.
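The same check can be approximated outside the UI. Prometheus normally exposes the metric `prometheus_config_last_reload_success_timestamp_seconds` about itself, and an instant query for it returns a Unix timestamp. The Python sketch below computes the age of the last reload from such a response; it assumes Prometheus scrapes its own metrics, the function name is hypothetical, and the sample response is made up.

```python
SECONDS_PER_DAY = 86400.0


def days_since_reload(response: dict, now_epoch: float) -> float:
    """Days elapsed since the last successful configuration reload.

    `response` is the decoded JSON body of an instant query for
    prometheus_config_last_reload_success_timestamp_seconds.
    """
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    result = response["data"]["result"]
    if not result:
        raise ValueError("no reload timestamp in response")
    # An instant-vector sample is [evaluation_time, "value_as_string"].
    reload_epoch = float(result[0]["value"][1])
    return (now_epoch - reload_epoch) / SECONDS_PER_DAY


# Made-up response: the reload happened exactly two days before "now".
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {}, "value": [1700172800.0, "1700000000"]}],
    },
}
age_days = days_since_reload(sample, now_epoch=1700172800.0)
recently_reloaded = age_days < 3
```

A value below 3 corresponds to the situation described above, in which the integration should be re-evaluated after a few days.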
Verify the Status of Prometheus Target Services
Verify the status of the Prometheus targets (from the Prometheus UI, check "Status > Targets"):
- Check the status of "node-exporter" (there should be 1 instance running for each node in the cluster)
- Check the status of "kube-state-metrics" (there should be at least 1 instance running)
- Check the status of "kubelet" (there should be at least 1 instance running for each node in the cluster)
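These target checks can also be scripted against the Prometheus HTTP API, whose `/api/v1/targets` endpoint lists active targets with their labels and health. The Python sketch below counts healthy targets per job from a decoded response; the job names depend on how the scrape configuration labels them, so the names used here and the sample response are assumptions.

```python
from collections import Counter


def healthy_targets_per_job(targets_response: dict) -> Counter:
    """Count active targets reporting health "up", grouped by the job label."""
    counts = Counter()
    for target in targets_response["data"]["activeTargets"]:
        if target.get("health") == "up":
            counts[target["labels"]["job"]] += 1
    return counts


# Made-up response for a two-node cluster with one node-exporter down.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "node-exporter"}, "health": "up"},
            {"labels": {"job": "node-exporter"}, "health": "down"},
            {"labels": {"job": "kube-state-metrics"}, "health": "up"},
            {"labels": {"job": "kubelet"}, "health": "up"},
            {"labels": {"job": "kubelet"}, "health": "up"},
        ]
    },
}
counts = healthy_targets_per_job(sample)
node_count = 2
node_exporter_ok = counts["node-exporter"] >= node_count
```

In this made-up case `node_exporter_ok` is false because only one of the two nodes has a healthy node-exporter target, which is the kind of gap the checklist above is meant to reveal.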
Verify data availability in Prometheus Tables
Verify that the following Prometheus tables contain data (from the Prometheus UI):
- "kube_pod_container_info" when Pod Workload, Controller, or Namespace metrics are missing (but also Cluster and Node metrics for Requests and Limits)
- "kube_node_info" when Node and Cluster metrics are missing.
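A minimal way to script this check is to run an instant query for each metric name and test whether the result is empty. The Python sketch below works on already-decoded responses of the Prometheus `/api/v1/query` endpoint; the function names and the sample responses are hypothetical.

```python
REQUIRED_METRICS = ("kube_pod_container_info", "kube_node_info")


def metric_has_data(response: dict) -> bool:
    """True if an instant query succeeded and returned at least one series."""
    if response.get("status") != "success":
        return False
    return bool(response.get("data", {}).get("result"))


def missing_metrics(responses: dict) -> list:
    """Return the required metric names whose query response holds no data."""
    return [name for name in REQUIRED_METRICS
            if not metric_has_data(responses.get(name, {}))]


# Made-up responses: one metric has a series, the other returns nothing.
responses = {
    "kube_pod_container_info": {
        "status": "success",
        "data": {"resultType": "vector",
                 "result": [{"metric": {}, "value": [1700000000, "1"]}]},
    },
    "kube_node_info": {
        "status": "success",
        "data": {"resultType": "vector", "result": []},
    },
}
missing = missing_metrics(responses)
```

An empty result for either name points to the corresponding exporter (typically kube-state-metrics) not being scraped, which matches the symptoms listed above.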