OTel Trace Details dashboard


In a microservices-based environment, tracking requests as they flow through the services is a common challenge. When a bottleneck or other issue occurs, its source is difficult to locate, which can degrade system performance and might result in downtime. Traces from application services help users identify the operations that cause such issues.

Use the OTel Trace Details dashboard to track and analyze traces so that you understand the flow of requests and responses across services and components. With this analysis, you can identify bottlenecks, latency issues, and other performance problems, and then optimize system performance.

Important

The dashboard is available only if the BMC Helix OpenTelemetry service is enabled for your tenant. To enable this service, contact BMC Support.

The dashboard provides the following information about traces for the selected service:

  • Top five operations by latency
  • Top five operations by number of calls
  • Trace details, such as ID, duration, operations, and a list of user-defined attributes

Scenario

Jim is an operator at Apex Global. He is responsible for monitoring the health of business services in the microservices-based environment. On the OTel Service Overview dashboard, Jim notices that the duration for one of the services is shown in red. He clicks the duration and navigates to the OTel Trace Details dashboard. Under Service and Operation, he expands the operation, identifies the sub-operations that cause the delay, and asks the SRE to fix them. The SRE fixes the issues, and the service is restored to its normal state.

To view the dashboard

  1. From the navigation menu, click Dashboards.
  2. Search for the Helix OpenTelemetry folder and select it.
  3. Click OTel Trace Details.
    The dashboard is displayed.


  4. From the Business Service list, select a business service.
    The list shows the business services for which OpenTelemetry is enabled.
  5. From the OTel Namespace list, select a namespace.
  6. From the OTel Service list, select an application service.
  7. From the Trace list, select a trace ID.
  8. (Optional) Change the date range for the data displayed in the dashboard. The default range is three hours.
  9. Review the trace data of the service in the dashboard panels.

Tip: Quick access from the Home page

To quickly open the dashboard from the Home page, mark it as a favorite by using the star icon. Additionally, after you open a dashboard, it is available under Recently viewed dashboards on the Home page.

Panels in the OTel Trace Details dashboard

Panel

Description

Top 5 Operations by Latency

Displays the five operations with the highest latency (in ms), in descending order. The operation with the highest latency is displayed at the top of the list.

Top 5 Operations by Times Called

Displays the five most frequently called operations, in descending order of the number of calls. The operation with the highest number of calls is displayed at the top of the list.

Operation Metrics

Displays the number of times each operation was called and the maximum time (in ms) taken to complete it.

Details for Trace ID

Displays the following details of the trace:

  • The trace start and end times, duration, number of services, operation depth, and operations
  • The time taken by each service in the trace to process the request
  • The hierarchical sequence of all the services through which the request passed

Under Service and Operation, click a service name to view the following information, which helps you identify and troubleshoot issues when an operation is in an error state:

  • Name, duration, start time, and the count of child services
  • Attributes, such as host name, port, and status code
  • Resource details, such as Kubernetes container ID, deployment name, pod name, telemetry name, and SDK name
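
To make the panel descriptions more concrete, the following minimal Python sketch shows one way that such values could be derived from raw span data. It is illustrative only and is not a description of how the dashboard computes its panels. The span field names (span_id, parent_id, service, operation, start_ms, duration_ms), the sample data, and the function names are all hypothetical. Ranking operations by their maximum latency and counting one operation per span are also assumptions.

```python
from collections import defaultdict

# Hypothetical span records for a single trace. The field names are
# assumptions for illustration, not the schema used by the dashboard.
SPANS = [
    {"span_id": "a", "parent_id": None, "service": "frontend",
     "operation": "GET /checkout", "start_ms": 0, "duration_ms": 120},
    {"span_id": "b", "parent_id": "a", "service": "cart",
     "operation": "get_cart", "start_ms": 5, "duration_ms": 30},
    {"span_id": "c", "parent_id": "a", "service": "payment",
     "operation": "charge", "start_ms": 40, "duration_ms": 70},
    {"span_id": "d", "parent_id": "c", "service": "payment",
     "operation": "db.query", "start_ms": 45, "duration_ms": 20},
    {"span_id": "e", "parent_id": "c", "service": "payment",
     "operation": "db.query", "start_ms": 70, "duration_ms": 35},
]


def operation_metrics(spans):
    """Return {operation: {"calls": count, "max_ms": longest duration}}."""
    metrics = defaultdict(lambda: {"calls": 0, "max_ms": 0})
    for span in spans:
        entry = metrics[span["operation"]]
        entry["calls"] += 1
        entry["max_ms"] = max(entry["max_ms"], span["duration_ms"])
    return dict(metrics)


def top_operations(spans, key, limit=5):
    """Rank operations by "max_ms" or "calls", highest first."""
    metrics = operation_metrics(spans)
    ranked = sorted(metrics, key=lambda op: metrics[op][key], reverse=True)
    return ranked[:limit]


def trace_summary(spans):
    """Summarize one trace: timing, service count, span count, and depth."""
    # Map each parent span ID to the IDs of its child spans.
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span["span_id"])

    def depth(span_id):
        # A leaf span has depth 1; a parent adds 1 to its deepest child.
        return 1 + max((depth(c) for c in children[span_id]), default=0)

    start = min(s["start_ms"] for s in spans)
    end = max(s["start_ms"] + s["duration_ms"] for s in spans)
    return {
        "start_ms": start,
        "end_ms": end,
        "duration_ms": end - start,
        "services": len({s["service"] for s in spans}),
        "operations": len(spans),
        # Root spans are the ones without a parent.
        "depth": max(depth(root) for root in children[None]),
    }
```

For example, top_operations(SPANS, "max_ms") ranks the sample operations by their longest observed duration, top_operations(SPANS, "calls") ranks them by how often they occur, and trace_summary(SPANS) returns the kind of per-trace values listed under Details for Trace ID.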

 
