Performing ML-based root cause isolation of an impacted service
To view the details of a service
Do one of the following actions to view the service details page:
- Click the Overview tab, and from the Services widget, click any of the impacted services whose details you want to view.
- Click the Services tab and click an individual service heat map or tile.
The following MP4 shows how to view the service details page, the impacted event details, the event actions available for any event, and how to cross-launch the event details page in BMC Helix Operations Management:
The following image provides more information about the details displayed for a service.
No. | Description |
---|---|
1 | Displays the service name, severity, incident ID associated with the service (if available), service impact score in percentage, service health score, and the date and time when the service was last updated. Click the incident ID link to launch the incident details page in BMC Helix IT Service Management – SmartIT. You must have permissions to view incidents in BMC Helix IT Service Management. |
2 | Displays the top 3 impacted entities (business services) that are associated with the service. |
3 | Pie chart displaying the count of open events impacting the service. The events are categorized by event status. The pie chart does not consider the INFO and OK events while displaying the event count. You can click the pie chart to view the list of all impacting events, Situations, changes, and incidents for the service. Currently, a maximum of 10,000 events are displayed for any service. From the Event Details page, click More Details to cross-launch into BMC Helix Operations Management and view all the associated event details. |
4 | Displays a timeline for the service health over a selected time range. It also shows the health score for the selected time range. You can hover over a time slot to view the health score. The health timeline does not display the INFO and OK events. For more information, see Service-health-score-impact-score-and-metrics. Legends to indicate incidents, events, and change requests are displayed on the health timeline. Hover over an event, incident, or change request to view the details. For more information, see Total-incident-count-and-mean-time-to-resolve-MTTR-indicators-for-a-reliable-incidence-response-process. |
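The counting rule described for the pie chart (item 3 above) can be illustrated with a short sketch. This is only an illustration, not how the product computes the chart: the `pie_chart_counts` function, the event dictionaries, and the `status` and `severity` field names are hypothetical, and it assumes that INFO and OK are severity values.

```python
from collections import Counter

# Severities that the pie chart leaves out, per the description above.
EXCLUDED_SEVERITIES = {"INFO", "OK"}


def pie_chart_counts(events):
    """Count events per status, skipping INFO and OK events.

    `events` is a hypothetical list of dicts with "status" and
    "severity" keys; the real event schema may differ.
    """
    return Counter(
        event["status"]
        for event in events
        if event["severity"].upper() not in EXCLUDED_SEVERITIES
    )


# Made-up sample data for illustration only.
sample_events = [
    {"status": "Open", "severity": "CRITICAL"},
    {"status": "Open", "severity": "INFO"},
    {"status": "Acknowledged", "severity": "MAJOR"},
    {"status": "Assigned", "severity": "OK"},
    {"status": "Open", "severity": "MINOR"},
]
counts = pie_chart_counts(sample_events)
print(dict(counts))
```

In this sample, the INFO and OK events are dropped before counting, so only three of the five events contribute to the chart.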
To view the ML-based root causes of an impacted service
- In BMC Helix AIOps, do one of the following to view the root causes of an impacted service:
- Click the Overview tab, and from the Services widget, click any of the impacted services whose details you want to view.
- Click the Services tab, and then click an individual service tile.
The service details page appears.
- (Optional) To customize the columns that appear here, click the column selector and clear the columns that you do not want to appear. Only selected columns are displayed.
You can also drag and drop the columns to rearrange them based on your requirement.
- To view the class for an event, hover over the icon in the Class column.
The event class name is displayed as a tooltip.
- To view the causal event details by causal nodes or situations, in the Root Cause Isolation tab, click View By and select one of the following options:
Causal Nodes: Displays the top 3 causal nodes impacting the service. Click a causal node and perform the following actions to view the event and change request details:
- To view the event details:
- Click Events to view top causal events.
- Hover over the score to view the score calculation details for the event.
- Click an event to view the event details.
- Click More Details to launch the event details page in BMC Helix Operations Management.
- Click the action menu icon to perform any of the supported event actions.
All logs and notes for an event are displayed.
- Enter a note in the text box and click Add Note to add any additional notes related to the event.
Any note that you add is also reflected for the event in BMC Helix Operations Management.
- To view the change details:
- Select Changes to view the top 3 change requests.
- Hover over the score to view the score calculation details for the change request.
- Click a change request to view the change details.
Situations: Displays the top 3 situations impacting the service. Click a situation to view the associated events. Click an event to view its details.
- In the Incident ID column, if an incident is created, click the link to view the incident details in BMC Helix IT Service Management – SmartIT.
To launch the incident details page, you must have the permissions to view incidents in BMC Helix IT Service Management.
- In the Automations column, automations that match the event are displayed.
To run automations, see Remediating-events-for-services-and-situations.
- Click Action and perform any of the available actions for the open events.
To perform actions, see To perform event actions for an impacted service.
To perform event actions for an impacted service
The capabilities available for your organization and your user role determine the event actions that you can perform against the open events. The following table describes the basic event actions.
Action | Description |
---|---|
Create Automation | Launches the BMC Helix Intelligent Automation > Create Automation Policy page to enable tenant administrators to create an automation policy. Requires the Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page. For more information, see Creating-automation-policies. |
Request Automation | Displays the Request Automation dialog box. Requires the Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page. For instructions on how to raise a request, see Requesting-for-a-new-automation. |
Trigger Automation | Displays the Run Automation dialog box that you can use to run automations for remediating the event. Requires the Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page. |
Acknowledge Event | Recognizes the existence of an open event. This operation changes the event status from Open to Acknowledged. |
Assign Event | Assigns ownership of an open, acknowledged, or assigned event to yourself or another person in the same account. This operation changes the event status from Open or Acknowledged to Assigned, and the event owner is updated with the selected user. If the event status is Assigned, only the ownership changes to the selected user. |
Close Event | Disables any further event operations on the event. Closed events are not considered for calculating the status of a device. You can close events with statuses Open, Assigned, and Acknowledged only. |
Decline Ownership | Removes ownership of an event in the assigned state. This operation changes the event status to Acknowledged. |
Set Event Priority | Assigns a priority level to the event. |
Take Ownership | Assigns ownership of an Open or Acknowledged event to yourself. |
Unacknowledge Event | Changes a previously Acknowledged event back to the Open state. |
Add Notes | Displays the Add Notes dialog box. |
Create Incident | Creates an incident in BMC Helix IT Service Management – SmartIT. The incident ID appears against the impacted nodes. You can click the link to view the incident details in BMC Helix IT Service Management – SmartIT. You must have permissions to view incidents in BMC Helix IT Service Management. |
For more information about the impact of the actions on the event, see Performing event operations in the BMC Helix Operations Management online documentation.
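The status changes described in the table can be read as a small state machine. The following sketch encodes only the transitions that the table states explicitly; it is an illustration under assumptions, not the product's logic. The short action keys (`acknowledge`, `assign`, and so on) and the `apply_action` function are hypothetical names, and actions whose resulting status the table does not state, such as Take Ownership, are left out.

```python
# Status transitions taken from the table above, keyed by a hypothetical
# short name for each action. Each inner dict maps the current status to
# the status after the action.
TRANSITIONS = {
    "acknowledge": {"Open": "Acknowledged"},
    "unacknowledge": {"Acknowledged": "Open"},
    "assign": {
        "Open": "Assigned",
        "Acknowledged": "Assigned",
        "Assigned": "Assigned",  # only the owner changes
    },
    "decline_ownership": {"Assigned": "Acknowledged"},
    "close": {
        "Open": "Closed",
        "Acknowledged": "Closed",
        "Assigned": "Closed",
    },
}


def apply_action(status, action):
    """Return the status after `action`, or raise if it is not allowed."""
    allowed = TRANSITIONS.get(action, {})
    if status not in allowed:
        raise ValueError(f"{action!r} is not available for a {status} event")
    return allowed[status]


# Walk one event through a sequence of actions.
status = "Open"
for action in ("acknowledge", "assign", "decline_ownership", "close"):
    status = apply_action(status, action)
print(status)  # -> Closed
```

Because no action lists Closed as a starting status, any further call for a closed event raises an error, which mirrors the statement that closing disables further event operations.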
To view the topological map of the service CIs
Click CI Topology to view the topological map of the service CIs and view the node details.
- (Optional) Use the various display options to maximize/minimize, drag or position, zoom in/out, and fit to center the topology map.
- From the map, select any node to view the node details.
- (Optional) Change the topology hierarchy, or enable or disable aggregation by CI Kind.
- (Optional) Modify the advanced filter to control the view of the topology map.
Based on the length of the selected criteria and available space to display, the filters are automatically tagged and grouped as +1 active, +2 active, and so on. You can click the tagged number to view the additional filters.
- (Optional) In the topological map, 10 or more CIs of the same kind are automatically grouped together. You need to expand the groups one after another to view the CIs. For example, consider a set of 15 CIs of the same kind that are grouped together. After expanding the group, you can view 9 CIs and another group, which you need to expand again to view the remaining 6 CIs.
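The grouping example above (15 CIs shown as 9 CIs plus a further group of 6) can be worked through in a few lines. This sketch assumes that every expansion reveals at most 9 CIs and nests the rest in another group; that rule is inferred from the single example and might not hold for every case. The `expansion_steps` function and both constants are hypothetical names.

```python
GROUP_THRESHOLD = 10     # 10 or more CIs of the same kind are grouped
VISIBLE_PER_EXPAND = 9   # assumption inferred from the 15-CI example


def expansion_steps(ci_count):
    """Return how many CIs become visible at each expansion of a group.

    Returns an empty list when the CIs are not grouped at all.
    """
    if ci_count < GROUP_THRESHOLD:
        return []  # below the threshold, CIs are shown individually
    steps = []
    remaining = ci_count
    while remaining > 0:
        shown = min(VISIBLE_PER_EXPAND, remaining)
        steps.append(shown)
        remaining -= shown
    return steps


print(expansion_steps(15))  # -> [9, 6]
```

The length of the returned list is the number of expansions needed to see every CI, so 15 CIs take two expansions under this assumption.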
To view service hierarchy
- Click Service Hierarchy to view the service node details of parent and child services.
- Click Upstream Hierarchy, Downstream Hierarchy, or both to view the upstream (parent nodes) or downstream (child nodes) service hierarchy of the current service.
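To make the upstream and downstream terms concrete, the following sketch walks a small service graph. The service names, the `CHILDREN` mapping, and both functions are hypothetical and exist only for illustration; they do not reflect how the product stores or queries service models.

```python
# Hypothetical parent -> children relationships between services.
CHILDREN = {
    "Online Banking": ["Payments", "Login"],
    "Payments": ["Database"],
    "Login": ["Database"],
    "Database": [],
}


def downstream(service):
    """Return all child services reachable from `service`."""
    seen = []
    stack = list(CHILDREN.get(service, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(CHILDREN.get(node, []))
    return sorted(seen)


def upstream(service):
    """Return all parent services from which `service` is reachable."""
    return sorted(s for s in CHILDREN if service in downstream(s))


print(upstream("Payments"))    # -> ['Online Banking']
print(downstream("Payments"))  # -> ['Database']
```

Selecting both hierarchies for a service corresponds to combining the two results around the current service.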
To view health indicators
Click Health Indicators to view the health indicators configured for the service. By default, the charts are displayed for the last 24 hours. You can also view the health indicators for the last 3, 6, or 12 hours. For more information, see Health Indicators and Adding or editing health indicators for a service.
To view metrics for an impacted service
Click Metrics to view the metrics chart for the top attributes of the causal node. If there are more than three metrics, only the top three trending metrics are displayed.
Based on the metric data and its trend, you can take action to resolve the issue. For more information, see Service-health-score-impact-score-and-metrics.
To view insights for service health, events, and incidents
Click Insights to discover the service behavior and its severity pattern over a pre-defined period of 15 days. The insights are represented in the form of text summaries and their corresponding graphs. These insights help the operators in taking corrective measures to ensure service continuity. For more information, see Service-health-score-impact-score-and-metrics.
- Health Score: You can see the trend of service health over a period, with the latest percentage degradation in health score between two consecutive dates.
- Severity pattern: You can see whether any severity (Critical or Major) is affecting a service every day during a specific time.
- Major and Critical Events: Occurrences of events are inversely correlated with service health. An increase in occurrences of Critical or Major events impacts a service by reducing its health score. Insights are available if there is an increasing trend of Major or Critical events over the period. You can see the trend of Critical or Major events over a period, with the average event occurrences and the latest increase in the number of events between two consecutive dates.
- Insights for Incidents: Incidents are raised against the events associated with a service and are processed in BMC Helix Operations Management as events of the Incident Info (INCIDENT_INFO) event class. These Incident Info events are used to derive incident-related insights in BMC Helix AIOps. Insights are available if there is an increasing trend of incident events over the period. You can see the trend of incidents over a period, with the average incident occurrences and the latest increase in the number of incidents between two consecutive dates.
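The figures mentioned in these insights (percentage degradation between two dates, average occurrences, and the latest increase) can be illustrated with simple arithmetic. This is a sketch under assumptions, not the product's algorithm: it treats the degradation as a relative drop between the two most recent daily values, and the function names and sample numbers are hypothetical.

```python
from statistics import mean


def latest_degradation_pct(daily_scores):
    """Percentage drop in health score between the two most recent dates.

    Returns 0.0 when the score did not drop.
    """
    previous, latest = daily_scores[-2], daily_scores[-1]
    if previous <= 0 or latest >= previous:
        return 0.0
    return (previous - latest) / previous * 100


def occurrence_insight(daily_counts):
    """Average daily count and the latest day-over-day increase."""
    return {
        "average": mean(daily_counts),
        "latest_increase": max(daily_counts[-1] - daily_counts[-2], 0),
    }


# Made-up daily values for illustration only.
health_scores = [90, 80, 72]
critical_events = [4, 6, 5, 9]

degradation = latest_degradation_pct(health_scores)
insight = occurrence_insight(critical_events)
print(f"Health score dropped by {degradation:.1f}% since the previous date")
print(f"Average events per day: {insight['average']}, "
      f"latest increase: {insight['latest_increase']}")
```

The same `occurrence_insight` calculation applies to incident counts, because both insights report an average and a latest increase.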