Kubernetes view

The Kubernetes view enables you to manage the capacity and efficiency of containerized environments managed by the Kubernetes platform, and presents key capacity metrics and charts for Kubernetes clusters, nodes, namespaces, deployments and pods.

You can use the Kubernetes view to complete tasks such as:

  • Understand resource bottlenecks and aggregate residual capacity of Kubernetes clusters as well as individual nodes
  • Detect current or imminent resource saturation conditions, and see the number of days before saturation, for every major Kubernetes resource (e.g. clusters, nodes, deployments, pods)
  • Assess the level of infrastructure efficiency by comparing allocated vs actually used resources, and identify the most wasteful deployments or pods
  • Identify application resource usage patterns and detect resource shortage conditions
  • Characterize the footprint of infrastructure resources on a per-container-image basis
  • Understand resource utilization of namespaces, including the level of usage of resource quotas

Videos

The following video (5:35) provides a brief introduction to the Kubernetes view.

https://youtu.be/2LlBvw_zzDk



The following video (9:53) provides information about how to access and use the Kubernetes view.

https://youtu.be/yyAIi8_DMkM


Requirements

Supported versions of TrueSight Capacity Optimization

  • Supported TrueSight Capacity Optimization versions: 10.7.01 onward

Conventions

The Kubernetes view provides summarized, high-level KPIs designed for capacity management. The following common conventions around naming and metrics aggregation are valid for all data presented in the view:

  • Standard table column names: Metric name [unit of measurement] (e.g. "Memory [GB]")
  • Unit of measurement is omitted when implicitly evident
  • The metric value is the aggregation of the last 30 days
  • The metric value is computed as follows: for each day, the daily peak is considered (at hourly resolution). Then, the mean value of the daily peak over the last 30 days is shown.

These conventions apply only to the summary metrics presented in tables and on the overall page. The charts presented in the details pages follow regular over-time metric semantics, and their time frame and time resolution are available as filters at the top of the page.
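The aggregation rule for summary metrics (mean of the daily peaks over the last 30 days) can be illustrated with a minimal Python sketch. The function name `summary_value` and the input shape (a list of timestamp/value pairs at hourly resolution) are assumptions made for illustration; they are not part of the product. The sketch also assumes that "last 30 days" means the 30 most recent days that have data.

```python
from statistics import mean


def summary_value(hourly_samples, window_days=30):
    """Return the mean of the daily peaks over the most recent days.

    hourly_samples: iterable of (datetime, value) pairs at hourly resolution
    (hypothetical input shape, for illustration only).
    """
    daily_peak = {}
    for timestamp, value in hourly_samples:
        day = timestamp.date()
        # Keep the highest hourly value seen for each calendar day.
        if day not in daily_peak or value > daily_peak[day]:
            daily_peak[day] = value

    # Assumption: use the most recent `window_days` days that have data.
    recent_days = sorted(daily_peak)[-window_days:]
    return mean(daily_peak[day] for day in recent_days)
```

For two days whose hourly peaks are 50 and 70, this returns 60. An empty input raises `statistics.StatisticsError`, so a caller would need to handle entities with no samples.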

View Structure

The Kubernetes view is composed of the following first-level pages:

  • Overview: it presents a summary of Kubernetes components states
  • Clusters: it shows capacity metrics for Kubernetes clusters
  • Nodes: it shows capacity metrics for Kubernetes nodes
  • Namespaces: it shows capacity metrics for Kubernetes namespaces
  • Deployments: it shows capacity metrics for Kubernetes deployments
  • Pods: it shows capacity metrics for Kubernetes pods


For each page except Overview, a set of second-level tabs covers some or all of the following areas:

  • Capacity: summary of the most important capacity indicators
  • CPU: the most relevant CPU configuration and performance metrics
  • Memory: the most relevant memory configuration and performance metrics
  • Storage: the most relevant storage configuration and performance metrics (this tab is available only on the Clusters page)


From any of the second-level tabs you can drill down to an entity details page, which presents the most relevant performance metrics as time charts and the most important configuration properties as tables.

Overview

The goal of the overview page is to provide at-a-glance aggregated capacity visibility over all of the main Kubernetes components.

For each component three doughnut graphs are represented, showing the number of entities of the specific component based on the corresponding capacity status:

  • Ok: the component is healthy from a capacity management perspective 
  • Warning: the component has breached (or will be breaching in the near future) a warning utilization threshold for one or more metrics
  • Alert: the component has breached (or will be breaching in the near future) a critical utilization threshold for one or more metrics

Refer to the section Thresholds, Status & Bottleneck below for more details on how the capacity status is calculated for each entity.

As shown in the picture below, each doughnut:

  • is scaled with the number of components taken into account;
  • has a color that depends on the component capacity status;
  • can be added to the "Favorites" tab by clicking the associated star.

Entity Pages

These first-level pages (Clusters, Nodes, Namespaces, Deployments and Pods) are designed to provide capacity management insights for the corresponding Kubernetes entity. Each page contains up to four second-level tabs (Capacity, CPU, Memory and Storage; the Storage tab is available only on the Clusters page), each of which presents capacity and efficiency KPIs related to the resources relevant for the analyzed entity.

Capacity

The first tab, "Capacity", is designed to provide a summary of the most important KPIs for managing the capacity of Kubernetes environments.

The table below summarizes the information provided by the tab. Depending on the particular entity that is being analyzed (e.g. Cluster or Pod), only the relevant set of columns is shown. For example, the "Deployment" column is only shown for Pods, while columns related to quotas, like "Mem Request vs Quota", are only shown for Namespaces.

| Column name | Meaning |
|---|---|
| Cluster | Name of the cluster |
| Node | Name of the node |
| Namespace | Name of the namespace |
| Deployment | Name of the deployment |
| Pod | Name of the pod |
| Pod # | Number of pods |
| Status | Indicator of the resource's status (*) |
| CPU Used vs Cap [%] | Percentage of CPUs actually used with respect to CPU capacity (number of CPUs) |
| CPU Request vs Cap [%] | Percentage of CPUs requested with respect to CPU capacity (number of CPUs) |
| CPU Request vs Quota [%] | Percentage of CPU request with respect to the namespace CPU request quota |
| CPU Limit vs Quota [%] | Percentage of CPU limit with respect to the namespace CPU limit quota |
| CPU Used vs Limit [%] | Percentage of CPU used with respect to the CPU limit |
| Mem Used vs Cap [%] | Percentage of memory actually used with respect to memory capacity (total memory) |
| Mem Request vs Cap [%] | Percentage of memory requested with respect to memory capacity (total memory) |
| Mem Request vs Quota [%] | Percentage of memory request with respect to the namespace memory request quota |
| Mem Limit vs Quota [%] | Percentage of memory limit with respect to the namespace memory limit quota |
| Mem Used vs Limit [%] | Percentage of memory used with respect to the memory limit |
| Pod # vs Pod Max [%] | Percentage of pods already created with respect to the maximum number of pods |
| Spare Pods | Estimated residual capacity, expressed as the number of additional pods that can be scheduled. This considers the average size of existing pods in the cluster |
| Bottleneck Resource | First resource to saturate |

The picture below is an example of the Capacity tab.
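The exact formula behind the "Spare Pods" column is not documented here. The following Python sketch shows one plausible way to estimate it from the average size of existing pods. The function name, the parameters, and the choice to size pods by their CPU and memory requests are all assumptions made for illustration, not the product's implementation.

```python
import math


def estimate_spare_pods(cpu_capacity, cpu_requested,
                        mem_capacity_gb, mem_requested_gb,
                        pod_count, pod_max):
    """Estimate how many more average-sized pods could be scheduled.

    Hypothetical approach: divide the unrequested CPU and memory by the
    average per-pod request, cap the result by the remaining pod slots,
    and return the most restrictive figure.
    """
    if pod_count == 0:
        return None  # the average pod size is undefined without pods

    avg_cpu = cpu_requested / pod_count
    avg_mem = mem_requested_gb / pod_count

    # Remaining pod slots always bound the estimate.
    candidates = [pod_max - pod_count]
    if avg_cpu > 0:
        candidates.append(math.floor((cpu_capacity - cpu_requested) / avg_cpu))
    if avg_mem > 0:
        candidates.append(math.floor((mem_capacity_gb - mem_requested_gb) / avg_mem))

    return max(0, min(candidates))
```

With 16 CPUs (8 requested), 64 GB of memory (48 GB requested), and 8 pods out of a maximum of 110, memory is the most restrictive resource and the estimate is 2 additional pods.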

CPU

The CPU tab presents the most important metrics related to the CPU. The columns are defined in the table below.

| Column name | Meaning |
|---|---|
| Cluster | Name of the cluster |
| Node | Name of the node |
| Namespace | Name of the namespace |
| Deployment | Name of the deployment |
| Pod | Name of the pod |
| Status | Resource capacity status |
| Days To Saturation | Days before the resource is saturated |
| CPU # | Number of CPUs |
| CPU Used | Amount of CPU used |
| CPU Request | Amount of CPU request |
| CPU Limit | Amount of CPU limit |
| Quota Request | Quota set for the CPU request resource |
| Quota Limit | Quota set for the CPU limit resource |
| CPU Used vs Request [%] | Percentage of CPU used with respect to the CPU request |
| CPU Used vs Limit [%] | Percentage of CPU used with respect to the CPU limit |
| CPU Used vs Cap [%] | Percentage of CPUs actually used with respect to CPU capacity (number of CPUs) |
| CPU Request vs Cap [%] | Percentage of CPUs requested with respect to CPU capacity (number of CPUs) |
| CPU Request vs Quota [%] | Percentage of CPU request with respect to the namespace CPU request quota |
| CPU Limit vs Quota [%] | Percentage of CPU limit with respect to the namespace CPU limit quota |
| CPU Overcommitment [%] | Percentage of CPU limit with respect to the CPU capacity (number of CPUs) |
| Bottleneck | First resource to saturate |
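The derived percentage columns in the CPU tab are simple ratios of the base metrics. As a minimal sketch, assuming hypothetical function names and plain numeric inputs expressed in CPU cores, they could be computed as follows:

```python
def pct(numerator, denominator):
    """Return numerator as a percentage of denominator, or None if undefined."""
    if not denominator:
        return None
    return 100.0 * numerator / denominator


def cpu_ratios(used, request, limit, capacity):
    """Compute the derived CPU columns from the base metrics (in cores)."""
    return {
        "CPU Used vs Request [%]": pct(used, request),
        "CPU Used vs Limit [%]": pct(used, limit),
        "CPU Used vs Cap [%]": pct(used, capacity),
        "CPU Request vs Cap [%]": pct(request, capacity),
        # Overcommitment compares the sum of limits with physical capacity,
        # so it can legitimately exceed 100%.
        "CPU Overcommitment [%]": pct(limit, capacity),
    }
```

For example, `cpu_ratios(used=3, request=2, limit=12, capacity=8)` yields 150% for both "CPU Used vs Request" and "CPU Overcommitment", and 37.5% for "CPU Used vs Cap". The same pattern applies to the memory columns.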


Memory

The Memory tab presents the most important metrics related to memory. The columns are defined in the table below.

| Column name | Meaning |
|---|---|
| Cluster | Name of the cluster |
| Node | Name of the node |
| Namespace | Name of the namespace |
| Deployment | Name of the deployment |
| Pod | Name of the pod |
| Status | Indicator of the resource's status |
| Days To Saturation | Days before the saturation of the physical resources |
| Memory | Amount of memory |
| Mem Used | Amount of memory used |
| Mem Request | Amount of memory request |
| Mem Limit | Amount of memory limit |
| Quota Request | Amount of memory quota request |
| Quota Limit | Amount of memory quota limit |
| Mem Used vs Request [%] | Percentage of memory used with respect to memory request |
| Mem Used vs Limit [%] | Percentage of memory used with respect to memory limit |
| Mem Used vs Cap [%] | Percentage of memory used with respect to memory capacity (total memory) |
| Mem Request vs Cap [%] | Percentage of memory request with respect to memory capacity (total memory) |
| Mem Request vs Quota [%] | Percentage of memory request with respect to memory quota |
| Mem Limit vs Quota [%] | Percentage of memory limit with respect to memory quota |
| Mem Overcommitment [%] | Percentage of memory limit with respect to memory capacity (total memory) |
| Bottleneck | First resource to saturate |

Storage

Because persistent volumes in Kubernetes can be requested only from the cluster, the Clusters page is the only one with this tab. Because persistent volume management in Kubernetes is still at an early stage of development, only the following information is currently available.

| Column name | Meaning |
|---|---|
| Cluster | Name of the cluster |
| Number of PV | Number of persistent volumes |
| PV Capacity [GB] | Storage capacity aggregated across all of the configured persistent volumes |
| PV Allocated [GB] | Storage allocated space aggregated across all of the configured persistent volumes |
| PV Free [GB] | Storage free space aggregated across all of the configured persistent volumes |
| PV Allocated vs Capacity [%] | Percentage of storage allocated with respect to capacity |
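As an illustration of how the Storage columns relate to each other, the following Python sketch aggregates per-volume figures into cluster-level values. The function name and input shape are hypothetical, and the sketch assumes that free space equals capacity minus allocated space, which this page does not state explicitly.

```python
def pv_summary(volumes):
    """Aggregate persistent volume figures for one cluster.

    volumes: list of (capacity_gb, allocated_gb) pairs, one per persistent
    volume (hypothetical input shape, for illustration only).
    """
    total_capacity = sum(capacity for capacity, _ in volumes)
    total_allocated = sum(allocated for _, allocated in volumes)
    return {
        "Number of PV": len(volumes),
        "PV Capacity [GB]": total_capacity,
        "PV Allocated [GB]": total_allocated,
        # Assumption: free space is capacity minus allocated space.
        "PV Free [GB]": total_capacity - total_allocated,
        "PV Allocated vs Capacity [%]": (
            100.0 * total_allocated / total_capacity if total_capacity else None
        ),
    }
```

For two volumes of 100 GB (40 GB allocated) and 50 GB (20 GB allocated), the summary reports 150 GB of capacity, 60 GB allocated, 90 GB free, and 40% allocated vs capacity.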


Details page

Click the name of a resource to open its Details page. As shown in the picture below, this dashboard presents time charts of the most important metrics for the selected entity, together with a table of its configuration metrics.


For more details on how the TrueSight metrics map to the Kubernetes ones and how the derived metrics are computed, please refer to the Kubernetes integrator documentation.

Thresholds, Status & Bottleneck

The table below reports the thresholds used in the view.
The Resource column is defined as "Resource taken into account: status". The defined resources are:

  • CPU: all the metrics related to the CPU (e.g. CPU utilization or CPU Request over CPU Number)
  • MEM: all the metrics related to the memory (e.g. memory utilization or memory Request over memory capacity)
  • SAT: days before saturation of the resource
  • POD: percentage of pods required over the pod creation limit (this limit depends on the installed Kubernetes version)

| Resource | Cluster | Node | Namespace | Deployment | Pod |
|---|---|---|---|---|---|
| CPU: OK | < 80% | < 80% | < 80% | < 80% | < 80% |
| CPU: ALERT | > 90% | > 90% | > 90% | > 90% | > 90% |
| MEM: OK | < 75% | < 75% | < 75% | < 75% | < 75% |
| MEM: ALERT | > 85% | > 85% | > 85% | > 85% | > 85% |
| SAT: OK | > 90 days | > 90 days | > 90 days | > 90 days | > 90 days |
| SAT: ALERT | < 30 days | < 30 days | < 30 days | < 30 days | < 30 days |
| POD: OK | < 70% | < 70% | < 70% | < 70% | < 70% |
| POD: ALERT | > 80% | > 80% | > 80% | > 80% | > 80% |
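The thresholds above can be read as a three-way classification: OK, ALERT, and a Warning band in between. The following Python sketch encodes that reading. The function and constant names are hypothetical, and treating values exactly on a boundary (for example 80% CPU) as Warning is an assumption, since the table only lists strict inequalities.

```python
# (ok_below, alert_above) utilization thresholds in percent, from the table.
UTILIZATION_THRESHOLDS = {
    "CPU": (80, 90),
    "MEM": (75, 85),
    "POD": (70, 80),
}


def classify_utilization(resource, percent):
    """Classify a utilization percentage for CPU, MEM or POD metrics."""
    ok_below, alert_above = UTILIZATION_THRESHOLDS[resource]
    if percent < ok_below:
        return "OK"
    if percent > alert_above:
        return "ALERT"
    return "WARNING"  # assumption: everything in between, boundaries included


def classify_saturation(days_to_saturation):
    """Classify days to saturation (SAT): more days remaining is better."""
    if days_to_saturation > 90:
        return "OK"
    if days_to_saturation < 30:
        return "ALERT"
    return "WARNING"
```

For instance, a memory utilization of 80% falls in the Warning band, while 45 days to saturation is also a Warning.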

The status of an entity is the worst status among the following resources, and the bottleneck is the worst resource associated with that status.

| Resource | Bottleneck |
|---|---|
| CPU/Memory Utilization | CPU/MEM:USED |
| CPU/Memory Request on Capacity | CPU/MEM:REQUEST |
| CPU/Memory Limit on Quota | CPU/MEM:QUOTA |
| CPU/Memory Request on Quota | CPU/MEM:QUOTA |
| CPU/Memory Used on Limit | CPU/MEM:QUOTA |
| Pod number | POD:NUM |
| Days to saturation | CPU/MEM:USED |


Note that the CPU/Memory Used on Request metrics are not considered in the status and bottleneck evaluation: usage is allowed to exceed the request, so these metrics can go above 100%.
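The "worst status wins" rule can be sketched in Python as follows. The function name and input shape are hypothetical, and breaking ties between resources with the same status by picking the highest utilization is an assumption, as this page does not specify how "the worst resource" is chosen.

```python
# Higher number means worse status.
SEVERITY = {"OK": 0, "WARNING": 1, "ALERT": 2}


def status_and_bottleneck(evaluations):
    """Return (overall_status, bottleneck_label) for one entity.

    evaluations: list of (bottleneck_label, status, percent) tuples, one per
    evaluated resource (hypothetical input shape, for illustration only).
    """
    # Pick the worst status first; among equal statuses, pick the highest
    # utilization (assumed tie-break rule).
    label, status, _ = max(
        evaluations, key=lambda item: (SEVERITY[item[1]], item[2])
    )
    return status, label


example = [
    ("CPU:USED", "OK", 40.0),
    ("MEM:REQUEST", "WARNING", 78.0),
    ("MEM:QUOTA", "WARNING", 82.0),
    ("POD:NUM", "OK", 10.0),
]
overall_status, bottleneck = status_and_bottleneck(example)
```

In this example the entity status is WARNING and the bottleneck is MEM:QUOTA, because it is the Warning resource with the highest utilization.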

Materializer Task

The Kubernetes view uses TrueSight Capacity Optimization materialized data marts to pre-compute the underlying data, which enables faster page loading times during browsing. To achieve this, a dedicated DataMartMaterializer task is deployed as part of the view installation process. By default, the task is scheduled to run every day at midnight. Modify the scheduling parameters to fit the needs of your environment, taking into account factors such as:

  • Data loading schedule and warehouse latency
  • Multiple refreshes in the same day
  • Kubernetes view user needs

The Kubernetes View materializer is shown in the next figure and can be found under System Tasks.


