OTel SLO Dashboard

Use the OTel SLO Dashboard to monitor service reliability against the defined Service Level Objectives (SLOs). The dashboard provides the following metrics to understand and track reliability risks:

- Core indicators, such as Service Level Indicator (SLI), target SLO, remaining error budget, and latency-based slow requests
- Operational metrics, such as total requests, successful requests, and error counts
- Burn analytics, such as total error budget, burned events, and burn rate, which indicate how quickly you are moving toward or away from an SLO breach

With this information, you can detect reliability issues early and prevent failed transactions, session drops, and churn.

To view the dashboard

1. Log in to BMC Helix Dashboards.
2. Expand the left navigation pane, and click Dashboards.
3. Search for the Helix OpenTelemetry Dashboards folder and click it.
4. Click OTEL SLO.
5. Use the following filters to view specific details:
- Business Service: Filter by the business service for which you want to track reliability.
- OTel Namespace: Filter by a namespace that logically groups telemetry data for related services or components.
- OTel Service: Filter by an OTel service that collects telemetry data from the applications.
- Target SLO: Specify the target SLO, which is the predefined performance threshold used to determine whether a service meets its expected reliability level.
- Latency Target (ms): Specify the latency threshold in milliseconds. Latency is the total delay between a user initiating a request (for example, clicking a link) and receiving the corresponding response from the server.
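
To illustrate how the Target SLO and Latency Target (ms) values can work together, the following Python sketch classifies requests as good or bad and compares the resulting SLI with the target. The `Request` type, its field names, and the rule that a request counts as bad when it fails or exceeds the latency target are assumptions made for illustration; the dashboard's own queries might differ.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; not a dashboard or OpenTelemetry type."""
    latency_ms: float
    is_error: bool


def is_bad(req: Request, latency_target_ms: float) -> bool:
    # Assumption: a request is bad if it failed or was slower than the target.
    return req.is_error or req.latency_ms > latency_target_ms


def sli_percent(requests: list[Request], latency_target_ms: float) -> float:
    """Percentage of good requests; returns 100.0 when there is no traffic."""
    if not requests:
        return 100.0
    good = sum(1 for r in requests if not is_bad(r, latency_target_ms))
    return 100.0 * good / len(requests)


# Sample data: two good requests, one slow request, one failed request.
requests = [
    Request(latency_ms=120, is_error=False),
    Request(latency_ms=950, is_error=False),  # slower than the 500 ms target
    Request(latency_ms=80, is_error=True),    # returned an error
    Request(latency_ms=200, is_error=False),
]

sli = sli_percent(requests, latency_target_ms=500)
meets_slo = sli >= 99.0  # compare against a Target SLO of 99%
```

With this sample data, two of the four requests are good, so the SLI is 50% and the 99% target is not met.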

Panels in the OTel SLO Dashboard

The following table describes the panels in the dashboard:

Panel	Description
Executive View
SLI	Displays the percentage of successful requests relative to the total number of requests. This metric indicates how well a service performs.
SLO	Displays the target threshold for a service that you select in the Target SLO filter.
Remaining Error Budget (Count)	Displays the number of errors or slow requests that the service can still absorb before it breaches the SLO.
Remaining Error Budget (%)	Displays the percentage of the total error budget that is still available to be consumed.
Burn Rate	Displays how fast the error budget is being consumed. For example, a rate of 1x is sustainable, whereas a rate above 6x indicates an active incident. Use this rate to determine whether the current error level is sustainable or is likely to cause an SLO violation before the compliance window ends.
Operational Health
Total Operations	Displays the total number of requests that are received by a service.
Requests Without Errors	Displays the total number of requests that completed successfully without returning an error status code.
Requests With Errors	Displays the total number of requests that failed with an error status code.
Runway	Displays the number of days until the error budget is exhausted at the current 7-day rolling burn rate.
Slow requests	Displays the count of requests that took longer than the acceptable latency threshold. Slow requests affect performance, consume error budget, and risk SLO violation.
Average Latency (P99)	Displays the 99th percentile response time across the service. 99% of all requests complete faster than this value.
Burn & Budget Details
Total Error Budget	Displays the maximum number of failures (in terms of time or errors) that your service can incur before its SLO is breached.
Burned Events	Displays the total count of requests that either returned an error or exceeded the latency threshold.
Budget Burn 5 Min	Displays the percentage of total error budget consumed by failures in the last 5 minutes. It indicates how quickly a service consumed its budget over the five-minute period.
Budget Burn 60 Min	Displays the percentage of total error budget consumed by failures in the last 60 minutes. It indicates how quickly a service consumed its budget during the last hour and whether this consumption rate is sustainable within the SLO period.
Budget Burn 360 Min	Displays the percentage of total error budget consumed by failures in the last 360 minutes. It indicates how quickly a service consumed its budget during the last 360 minutes and whether this consumption rate is sustainable within the SLO period.
Error Budget Burn Down	Displays a graph of the remaining error budget over time.
Budget Consumption Rate	Displays the ratio of failures in the last hour against the average hourly failures allowed by your SLO.
Top 10 Slowest Endpoints	Displays a tabular view of the top 10 slowest endpoints, with added fields for total requests, errors, and maximum latency.
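
The relationship between the budget and burn panels can be sketched with the formulas commonly used in SLO practice. The following Python example is a minimal sketch, not the dashboard's actual queries: the function names are hypothetical, the error budget is assumed to be a count of allowed bad events, and the runway is assumed to use the average number of bad events per day over the last 7 days.

```python
def _allowed_fraction(target_slo_percent: float) -> float:
    """Fraction of requests that may fail, for example 0.001 for a 99.9% SLO."""
    if not 0 < target_slo_percent < 100:
        raise ValueError("target_slo_percent must be between 0 and 100 (exclusive)")
    return 1 - target_slo_percent / 100


def total_error_budget(total_requests: int, target_slo_percent: float) -> float:
    """Assumed Total Error Budget: number of bad events the SLO allows."""
    return total_requests * _allowed_fraction(target_slo_percent)


def remaining_error_budget(total_requests: int, burned_events: int,
                           target_slo_percent: float) -> tuple[float, float]:
    """Return the remaining budget as (count, percentage of the total budget)."""
    budget = total_error_budget(total_requests, target_slo_percent)
    remaining = budget - burned_events
    remaining_pct = 100 * remaining / budget if budget else 0.0
    return remaining, remaining_pct


def burn_rate(bad_in_window: int, total_in_window: int,
              target_slo_percent: float) -> float:
    """Observed failure ratio divided by the failure ratio the SLO allows."""
    if total_in_window == 0:
        return 0.0
    observed = bad_in_window / total_in_window
    return observed / _allowed_fraction(target_slo_percent)


def runway_days(remaining_count: float, bad_events_per_day: float) -> float:
    """Days until the budget is exhausted at the given daily burn."""
    if bad_events_per_day <= 0:
        return float("inf")  # nothing is burning, so the budget never runs out
    return max(remaining_count, 0.0) / bad_events_per_day


# Example values (made up for illustration).
total, burned, target = 1_000_000, 250, 99.9

budget = total_error_budget(total, target)                               # about 1000
remaining, remaining_pct = remaining_error_budget(total, burned, target)  # about 750 and 75%
rate = burn_rate(bad_in_window=60, total_in_window=10_000,
                 target_slo_percent=target)                              # about 6x
runway = runway_days(remaining, bad_events_per_day=50)                   # about 15 days
```

In this example, a burn rate of about 6x means that the budget is being consumed roughly six times faster than the SLO allows, even though 75% of the budget is still available.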

BMC Helix Dashboards 26.1
