Platform Observability dashboard


As a tenant administrator, use the Platform Observability dashboard to analyze and optimize the ingestion of traces, metrics, logs, and events so that you can manage observability and its costs. The dashboard groups the ingested data by business service, and each business service is mapped to the business owners and teams responsible for managing its data ingestion.

You can work with these business owners and teams to manage observability and costs. For example, the Log Resource Usage section of the dashboard shows the volume of ingested log data. Analyze this data to optimize log storage, which is a significant contributor to operational expenses. You can also configure thresholds on this data to send proactive notifications to the business owners and teams.
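
The threshold idea described above can be illustrated with a minimal Python sketch. It is not the product's notification mechanism: the LogUsage record, the field names, the sample values, and the 5000 MB threshold are all hypothetical, and the sketch only shows the comparison logic that a threshold on ingested log volume implies.

```python
from dataclasses import dataclass


@dataclass
class LogUsage:
    service: str        # business service name (hypothetical sample data)
    owner: str          # owning team or contact (hypothetical)
    ingested_mb: float  # log volume ingested during the selected interval


def services_over_threshold(usages, threshold_mb):
    """Return the usages whose log volume exceeds threshold_mb, largest first."""
    over = [u for u in usages if u.ingested_mb > threshold_mb]
    return sorted(over, key=lambda u: u.ingested_mb, reverse=True)


def notification_text(usage, threshold_mb):
    """Build a plain-text message addressed to the owner of a service."""
    return (
        f"To {usage.owner}: service '{usage.service}' ingested "
        f"{usage.ingested_mb:.1f} MB of logs, above the {threshold_mb:.1f} MB threshold."
    )


sample = [
    LogUsage("checkout", "payments-team", 5200.0),
    LogUsage("search", "discovery-team", 800.0),
    LogUsage("billing", "finance-team", 12500.5),
]
flagged = services_over_threshold(sample, threshold_mb=5000.0)
for usage in flagged:
    print(notification_text(usage, 5000.0))
```

Sorting the flagged services by volume puts the largest contributor to log storage first, which is usually the first conversation to have with an owning team.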

The dashboard lists the top services by ingestion for each type of telemetry data, such as events, topology, and OpenTelemetry.
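
As a rough illustration of how a "top services per telemetry type" ranking can be derived, the following Python sketch groups ingestion counts by type and service and keeps the largest entries. The record layout (telemetry type, service, count) and the sample values are assumptions made for this example; they do not reflect how the platform stores or queries its data.

```python
from collections import defaultdict


def top_services(records, n=5):
    """Rank services by total ingestion count within each telemetry type.

    records is an iterable of (telemetry_type, service, count) tuples,
    a hypothetical layout chosen for this sketch.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for telemetry_type, service, count in records:
        totals[telemetry_type][service] += count

    ranking = {}
    for telemetry_type, per_service in totals.items():
        # Highest count first; ties are broken alphabetically by service name.
        ordered = sorted(per_service.items(), key=lambda kv: (-kv[1], kv[0]))
        ranking[telemetry_type] = ordered[:n]
    return ranking


records = [
    ("events", "checkout", 120),
    ("events", "search", 45),
    ("events", "checkout", 30),
    ("otel", "search", 900),
    ("otel", "billing", 900),
    ("otel", "checkout", 10),
]
ranking = top_services(records, n=2)
```

With n=5, the same function yields a top-five list per telemetry type.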

To view the dashboard

  1. From the navigation menu, click Dashboards.
  2. Search for the AIOps Observability folder and select it.
  3. Click Platform Observability.
    The dashboard is displayed.

    Figure: Platform Observability dashboard
     

Panels in the Platform Observability dashboard

Panel | Description

Event Resource Usage
Total Events Ingested | Displays the total number of events ingested during the selected interval.
Event Ingestion Distribution | Displays the pie chart containing the events that impact a business service and the events that do not impact the business service.
Top 5 Services By Event Ingestion | Displays a list of top five business services by event ingestion.

CI Resource Usage
Top 5 CIs By Metric Ingestion | Displays the top five configuration items (CIs) based on the volume of performance metrics or data that they ingest in the platform or Large Language Model (LLM) system.
Top 5 Sources By Metric Ingestion | Displays the top five sources based on the volume of performance metrics or data that they ingest in the platform or LLM system.
CI Service Association | Displays the CIs that are associated with a service and CIs that are not associated with a service. Click the number in the Associated column corresponding to a CI type to view the CI details.

OTel Resource Usage
OTel Operations Ingested | Displays the total number of OTel operations ingested in the platform or LLM system during a selected interval.
OTel Traces Ingested | Displays the total number of OTel traces ingested in the platform or LLM system during a selected interval.
Top 5 Services By OTel Operation Count | Displays the top five business services by OTel operation count.

Log Resource Usage
Volume of Ingested Logs | Displays the volume of ingested logs during a selected interval.
Log Event Distribution | Displays the pie chart containing log events that impact a business service and log events that do not impact the business service.
Top 5 Services By Log Event Ingestion | Displays the top five business services by log event ingestion. Click a business service name or a number of events to view the business service details.
Daily Log Ingestion | Displays the bar charts that indicate the volume of logs ingested daily (in MB).
Top 5 Hosts by Log Volume Associated With a Service | Displays the bar charts that indicate the top five hosts according to the volume of log ingestion associated with a business service.
Top 5 Hosts by Log Volume Not Associated With a Service | Displays the bar charts indicating the top five hosts by the volume of log ingestion that is not associated with a business service.
Top 5 Services by Log Record Count | Displays the bar charts indicating the top five business services based on the log record count.
Top 5 Connectors by Log Volume | Displays the bar charts indicating the top five connectors based on the log volume.
Top 5 Hosts by Log Record Count | Displays the bar charts indicating the top five hosts based on the log record count.
Top 5 Hosts by Log Volume | Displays the bar charts indicating the top five hosts based on the log volume.
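
Two of the panels described above, the distribution pie charts and Daily Log Ingestion, reduce to simple aggregations. The following Python sketch shows one possible way to compute them from raw records. The record fields (service, timestamp, size_bytes), the sample data, and the use of binary megabytes are assumptions for illustration only; the dashboard's actual calculations are not documented here.

```python
from collections import Counter, defaultdict
from datetime import datetime

BYTES_PER_MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def impact_distribution(events):
    """Percentage of events with and without an associated business service."""
    counts = Counter(
        "impacting" if event.get("service") else "non_impacting" for event in events
    )
    total = sum(counts.values())
    if total == 0:
        return {"impacting": 0.0, "non_impacting": 0.0}
    return {
        key: round(100.0 * counts[key] / total, 1)
        for key in ("impacting", "non_impacting")
    }


def daily_volume_mb(log_records):
    """Sum log sizes per calendar day and convert the totals from bytes to MB."""
    per_day = defaultdict(int)
    for record in log_records:
        day = datetime.fromisoformat(record["timestamp"]).date().isoformat()
        per_day[day] += record["size_bytes"]
    return {day: size / BYTES_PER_MB for day, size in sorted(per_day.items())}


# Hypothetical sample data.
events = [
    {"id": 1, "service": "checkout"},
    {"id": 2, "service": None},
    {"id": 3, "service": "search"},
    {"id": 4, "service": "billing"},
]
logs = [
    {"timestamp": "2024-05-01T10:00:00", "size_bytes": 1048576},
    {"timestamp": "2024-05-01T23:59:59", "size_bytes": 524288},
    {"timestamp": "2024-05-02T00:00:01", "size_bytes": 2097152},
]
distribution = impact_distribution(events)
daily = daily_volume_mb(logs)
```

The same split on whether a record carries a business service also underlies the "Associated With a Service" and "Not Associated With a Service" host panels: filter the records first, then rank the hosts by volume.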

 

 
