Performing ML-based root cause isolation of an impacted service

From the Services page, you can drill down to the service details page to perform root cause isolation of an impacted service. 

As an operator, you can perform the following tasks:

  • View service details
  • Perform ML-based root cause isolation
  • Perform event actions for an impacted service
  • View CI topology
  • View service hierarchy
  • View health indicators
  • View the top metrics from causal nodes
  • View service insights

If you have the appropriate permissions, you can also edit or delete a service. 

To view the details of a service

Do one of the following actions to view the service details page:

    • Click the Overview tab, and from the Services widget, click any of the impacted services whose details you want to view.
    • Click the Services tab and click an individual service heat map or tile.
      The following MP4 shows how to view the service details page, the impacted event details, the event actions available for any event, and how to cross-launch the event details page in
      BMC Helix Operations Management:


      The following image provides more information about the details displayed for a service. 

      No.   Description
      1

      Displays the service name, severity, incident ID associated with the service (if available), service impact score in percentage, service health score, and the date and time when the service was last updated.

      Click the link to launch the incident details page in BMC Helix IT Service Management – SmartIT. You must have permissions to view incidents in BMC Helix IT Service Management.

      2

      Displays the top 3 impacted entities (business services) that are associated with the service.   

      3

      Pie chart displaying the count of open events impacting the service. The events are categorized by event status. The pie chart does not consider the INFO and OK events while displaying the event count. You can click the pie chart to view the list of all impacting events and additional event details.

      From the Event Details page, click More Details to cross-launch into BMC Helix Operations Management and view all the associated event details.

      4

      Displays a timeline for the service health over a selected time range. It also shows the health score for the selected time range. You can hover over a time slot to view the health score. The health timeline does not display the INFO and OK events.

      For more information, see Service health score, impact score, and metrics.

      Legends to indicate incidents, events, and change requests are displayed on the health timeline. Hover over an event, incident, or change request to view the details.

      For more information, see Total incident count and mean time to resolve (MTTR) indicators for a reliable incidence-response process
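
The counting rule behind the pie chart (item 3) can be illustrated with a minimal sketch. It assumes each event is a plain dictionary with hypothetical `status` and `severity` keys and that an event counts as open while its status is not Closed; it is not the product's implementation or API.

```python
from collections import Counter

# Severities the pie chart leaves out, per the description of item 3.
EXCLUDED_SEVERITIES = {"INFO", "OK"}


def pie_chart_counts(events):
    """Count non-closed events per status, skipping INFO and OK events."""
    return Counter(
        event["status"]
        for event in events
        if event["severity"] not in EXCLUDED_SEVERITIES
        and event["status"] != "Closed"  # assumption: Closed events are not "open"
    )


# Hypothetical sample data, for illustration only.
events = [
    {"status": "Open", "severity": "CRITICAL"},
    {"status": "Open", "severity": "INFO"},
    {"status": "Acknowledged", "severity": "MAJOR"},
    {"status": "Assigned", "severity": "OK"},
    {"status": "Open", "severity": "MINOR"},
    {"status": "Closed", "severity": "CRITICAL"},
]
print(dict(pie_chart_counts(events)))  # {'Open': 2, 'Acknowledged': 1}
```

In this sample, the INFO and OK events and the closed event are dropped, so only three events contribute to the chart.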


To view the ML-based root causes of an impacted service

  1. In BMC Helix AIOps, do one of the following to view the root causes of an impacted service:
    • Click the Overview tab, and from the Services widget, click any of the impacted services whose details you want to view.
    • Click the Services tab and click an individual service tile.
      The service details page appears. 
  2. (Optional) To customize the columns that appear here, click the column selector and clear the columns that you do not want to appear. Only selected columns are displayed.
    You can also drag and drop the columns to rearrange them based on your requirement.  
  3. To view the class for an event, hover over the icon in the Class column. 
    The event class name is displayed as a tooltip. 
  4. To view the causal event details by causal nodes or situations, in the Root Cause Isolation tab, click View By and select one of the following options:
    • Causal Nodes: Displays the top 3 causal nodes impacting the service. Click a causal node and perform the following actions to view the event and change request details:
      • To view the event details:
        • Click Events to view top causal events. 
        • Hover over the score to view the score calculation details for the event.

        • Click on an event to view event details.

          Performance View tab for event details

          The Performance View tab is displayed only for alarm class events. It contains the time-series data collected from key attributes of the causal events.

        • Click More Details to launch the event details page in BMC Helix Operations Management. 
        • Click to perform any of the supported event actions.
          All logs and notes for an event are displayed.
        • Enter a note in the text box and click Add Note to add any additional notes related to the event.
          Any note added for the event is reflected for the event in BMC Helix Operations Management.
      • To view the change details:
        • Select Changes to view the top three change requests.
        • Hover over the score to view the score calculation details for the change.
        • Click on a change to view change details.

      View all events or all changes

      • Click the Show all events or Show all changes link to view all events or all changes for a particular causal node.
      • To switch back to viewing only the top events or top changes, click the Show top causal events or Show top causal changes link.
    • Situations: Displays the top 3 situations impacting the service. Click a situation to view the associated events. Click an event to view its details.  

      Launch the situation details page on the Situations tab

      Optionally, you can click the icon to launch the situation details page on the Situations tab. For more information, see Investigating ML-based situations.

  5. In the Incident ID column, if an incident is created, click the link to view the incident details in BMC Helix IT Service Management – SmartIT. 
    To launch the incident details page, you must have permissions to view incidents in BMC Helix IT Service Management.
  6. In the Automations column, automations that match the event are displayed.
    To run automations, see Remediating events for services and situations.
  7. Click Action and perform any of the available actions for the open events. 
    To perform actions, see To perform event actions for an impacted service.


To perform event actions for an impacted service

The capabilities available for your organization and your user role determine the event actions that you can perform against the open events. The following table describes the basic event actions.

Action

Description

Create Automation

Launches the BMC Helix Intelligent Automation > Create Automation Policy page to enable tenant administrators to create an automation policy.

Requires the Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page.

For more information, see Creating automation policies

Request Automation

Displays the Request Automation dialog box.

Requires the Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page.

For instructions on how to raise a request, see Requesting for a new automation.

Acknowledge Event

Recognizes the existence of an open event. This operation changes the event status from Open to Acknowledged.

Assign Event

Assigns ownership of an open, acknowledged, or assigned event to yourself or another person in the same account. This operation changes the event status from Open or Acknowledged to Assigned, and the event owner is updated with the selected user. If the event status is Assigned, only the ownership changes to the selected user.

Close Event

Disables any further event operations on the event. Closed events are not considered for calculating the status of a device.

You can close events with statuses Open, Assigned, and Acknowledged only.

Decline Ownership

Removes ownership of an event in the Assigned state. This operation changes the event status to Acknowledged.

Set Event Priority

Assigns a priority level to the event.

Take Ownership

Assigns ownership of an Open or Acknowledged event to yourself.

Unacknowledge Event

Changes a previously Acknowledged event back to the Open state.

Add Notes

Displays the Add Notes dialog box.

Create Incident

Creates an incident in BMC Helix IT Service Management – SmartIT. The incident ID appears against the impacted nodes. You can click the link to view the incident details in BMC Helix IT Service Management – SmartIT. You must have permissions to view incidents in BMC Helix IT Service Management.

For more information about the impact of the actions on the event, see Performing event operations in the BMC Helix Operations Management online documentation. 
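
The status changes described in the table can be read as a small state machine. The following is a minimal sketch of that reading, not the product's implementation: the `Event` class and the function names are hypothetical, and only the transitions stated in the table are modeled. The table does not state the resulting status for Take Ownership, so that action is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Hypothetical event record; not a product data structure."""
    status: str = "Open"
    owner: Optional[str] = None


def _require(event, allowed, action):
    # Reject an action that the table does not allow for the current status.
    if event.status not in allowed:
        raise ValueError(f"{action} is not allowed for a {event.status} event")


def acknowledge(event):
    _require(event, {"Open"}, "Acknowledge Event")
    event.status = "Acknowledged"


def assign(event, user):
    # For an already Assigned event, only the owner changes.
    _require(event, {"Open", "Acknowledged", "Assigned"}, "Assign Event")
    event.status = "Assigned"
    event.owner = user


def decline_ownership(event):
    _require(event, {"Assigned"}, "Decline Ownership")
    event.status = "Acknowledged"
    event.owner = None


def unacknowledge(event):
    _require(event, {"Acknowledged"}, "Unacknowledge Event")
    event.status = "Open"


def close(event):
    _require(event, {"Open", "Assigned", "Acknowledged"}, "Close Event")
    event.status = "Closed"


event = Event()
acknowledge(event)          # Open -> Acknowledged
assign(event, "operator1")  # Acknowledged -> Assigned, owner set
decline_ownership(event)    # Assigned -> Acknowledged, owner removed
close(event)                # Acknowledged -> Closed
print(event.status)         # Closed
```

Because a closed event accepts no further operations, every function in the sketch raises `ValueError` once the status is Closed.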


To view the topological map of the service CIs

Click CI Topology to view the topological map of the service CIs and view the node details.

CI node and impact link display color

The CI topology nodes are displayed according to their impact severity status. The CI impact path between impacted nodes is marked with dotted red lines, and the path between non-impacted nodes is marked with grey lines, as shown in this image.

    • (Optional) Use the various display options to maximize/minimize, drag or position, zoom in/out, and fit to center the topology map.
    • From the map, select any node to view the node details.
    • (Optional) Change the topology hierarchy, enable or disable aggregation by CI Kind.
    • (Optional) Modify the advanced filter to control the view of the topology map.
      Based on the length of the selected criteria and available space to display, the filters are automatically tagged and grouped as +1 active, +2 active, and so on. You can click the tagged number to view the additional filters.
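
The "+N active" tagging of filters can be pictured with a minimal sketch. It assumes a simple character budget stands in for the available display space and that filters are shown in selection order. The function name, the budget, and the filter labels are hypothetical, and the actual product logic may differ.

```python
def summarize_filters(filters, max_chars):
    """Return the filter labels that fit the budget and a '+N active' tag for the rest."""
    shown = []
    used = 0
    for index, label in enumerate(filters):
        # Each label after the first also costs a two-character separator.
        cost = len(label) + (2 if shown else 0)
        if used + cost > max_chars:
            hidden = len(filters) - index
            return shown, f"+{hidden} active"
        shown.append(label)
        used += cost
    return shown, None  # everything fits, so no tag is needed


# Hypothetical filter labels, for illustration only.
shown, tag = summarize_filters(
    ["Kind: Host", "Severity: Critical", "Name: web-01"], max_chars=30
)
print(shown, tag)  # ['Kind: Host', 'Severity: Critical'] +1 active
```

With a larger budget, all three labels fit and no tag is produced.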


To view service hierarchy

  1. Click Service Hierarchy to view the service node details of parent and child services.
  2. Click Upstream Impact, Downstream Impact, or both to view the upstream (parent nodes) or downstream (child nodes) impact paths of the current service. 


To view health indicators

Click Health Indicators to view the health indicators configured for the service. By default, the charts are displayed for the last 24 hours. You have options to view the health indicators for the last 3, 6, 12, or 24 hours. For more information, see Health Indicators and Adding or editing health indicators for a service.


To view metrics for an impacted service

Click Metrics to view the metrics chart for the top attributes of the causal node. If there are more than three metrics, only the top three trending metrics are displayed.
Based on the metric data and its trend, you can take action to resolve the issue. For more information, see Service health score, impact score, and metrics.

To view service health insights

Click Insights to discover the service behavior and its severity pattern over a predefined period of 15 days, represented in the form of text summaries and graphs. These insights help operators take corrective measures to ensure service continuity. For more information, see Service health score, impact score, and metrics.

  • Service health behavior: For a service, the text summary shows the highest percentage degradation over a period, and the graph represents the daily average health score trend for the predefined period. For example, see the following behavior pattern for a period of four consecutive days. 

  • Service severity pattern: For a service, the summary text shows the daily occurrence time of Critical or Major severity, and the corresponding graph highlights the severity occurrence pattern for the predefined period. For example, see the following severity pattern with two repetitive durations on three consecutive days.
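
The daily average health score trend can be illustrated with a minimal sketch that groups timestamped health score samples by day. The page does not define how the highest percentage degradation is calculated, so the second function uses one possible interpretation (the largest day-over-day drop in the daily average) purely as an assumption. The function names and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def daily_average_scores(samples):
    """Group (timestamp, score) samples by calendar day and average each day."""
    buckets = defaultdict(list)
    for timestamp, score in samples:
        buckets[timestamp.date()].append(score)
    return {day: sum(scores) / len(scores) for day, scores in sorted(buckets.items())}


def largest_daily_drop_percent(daily):
    """Assumed interpretation: largest day-over-day percentage drop in the daily average."""
    values = list(daily.values())
    drops = [
        (previous - current) / previous * 100
        for previous, current in zip(values, values[1:])
        if previous > 0 and current < previous
    ]
    return max(drops, default=0.0)


# Hypothetical health score samples, for illustration only.
samples = [
    (datetime(2024, 1, 1, 0), 100),
    (datetime(2024, 1, 1, 12), 90),
    (datetime(2024, 1, 2, 0), 80),
    (datetime(2024, 1, 2, 12), 72),
    (datetime(2024, 1, 3, 6), 95),
]
daily = daily_average_scores(samples)
for day, average in daily.items():
    print(day, average)
print(round(largest_daily_drop_percent(daily), 1))
```

In the product, the period is predefined as 15 days; the sketch accepts any number of days so that the sample data can stay short.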



