Performing probable cause analysis
To view the details of a service
In BMC Helix AIOps, do one of the following actions to view the service details page:
- Click the Overview tab, and from the Services widget, click any of the impacted services whose details you want to view.
- Click the Services tab and click an individual service heat map or tile.
The following MP4 shows how to view the service details page, the impacted event details, the event actions available for any event, and how to cross-launch the event details page in BMC Helix Operations Management.
You can view the following details of each service:
No. | Description |
---|---|
1 | Displays the service name, severity, incident ID associated with the service (if available), service impact score as a percentage, service health score, and the date and time when the service was last updated. Click the incident ID link to launch the incident details page in BMC Helix IT Service Management – SmartIT (you must have permissions to view incidents in BMC Helix IT Service Management). |
2 | Displays the top 3 impacted entities (business services) that are associated with the service. |
3 | Displays a pie chart with the count of open events impacting the service. The events are categorized by event status. The pie chart does not count INFO and OK events. You can click the pie chart to view the list of all impacting events and additional event details. From the Event Details page, click More Details to cross-launch into BMC Helix Operations Management and view all the associated event details. |
4 | Displays a timeline for the service health over a selected time range. It also shows the health score for the selected time range. You can hover over a time slot to view the health score. The health timeline does not display INFO and OK events. For more information, see Service-health-score-impact-score-and-metrics. Legends that indicate incidents, events, and change requests are displayed on the health timeline. Hover over an event, incident, or change request to view the details. For more information, see Total-incident-count-and-mean-time-to-resolve-MTTR-indicators-for-a-reliable-incidence-response-process. |
To view the probable causes of an impacted service
- In BMC Helix AIOps, do one of the following to view the probable causes of an impacted service:
- Click the Overview tab, and from the Services widget, click any of the impacted services whose details you want to view.
- Click the Services tab and click an individual service tile.
The service details page appears.
- Navigate to the Probable Cause tab.
- From the Causal Nodes (% Probability), select a causal node to view the top causal events and changes for that node.
- (Optional) To customize the columns that appear here, click the column selector and clear the columns that you do not want to appear.
Only selected columns are displayed. You can also drag and drop the columns to rearrange them based on your requirement.
- Do one of the following to view the event or change request details:
- To view the events and event details:
- Click Events to view top causal events.
- Hover over the score to view the score calculation details for the event.
- Click on an event message link to view event details.
- Click More Details to launch the event details page in BMC Helix Operations Management.
- Click the action menu icon to perform any of the supported event actions.
All logs and notes for an event are displayed.
- Enter a note in the text box and click Add Note to add any additional notes related to the event.
Any note added for the event is reflected for the event in BMC Helix Operations Management.
- To view the change details:
- Select Changes to view the top three change requests.
- Hover over the score to view the score calculation details for the change.
- Click a change to view the change details.
Situations: Displays the top 3 situations impacting the service. Click a situation to view the associated events. Click an event to view its details.
- To view the events and event details:
- In the Incident ID column, if an incident is created, click the link to view the incident details in BMC Helix IT Service Management – SmartIT.
To launch the incident details page, you must have the permissions to view incidents in BMC Helix IT Service Management.
- In the Automations column, automations that match the event are displayed.
To run automations, see Remediating events for services and situations.
- Click Action and perform any of the available actions for the open events.
To perform actions, see To perform event actions for an impacted service.
To perform event actions for an impacted service
The capabilities available for your organization and your user role determine the event actions that you can perform against the open events. The following table describes the basic event actions.
Action | Description |
---|---|
Create Automation | Launches the BMC Helix Intelligent Automation > Create Automation Policy page to enable tenant administrators to create an automation policy. Requires Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page. For more information, see Creating automation policies. |
Request Automation | Displays the Request Automation dialog box. Requires Intelligent Automations feature to be enabled from the Configurations > Manage Product Features page. For instructions on how to raise a request, see Requesting for a new automation. |
Acknowledge Event | Recognizes the existence of an open event. This operation changes the event status from Open to Acknowledged. |
Assign Event | Assigns ownership of an open, acknowledged, or assigned event to yourself or another person in the same account. This operation changes the event status from Open or Acknowledged to Assigned, and the event owner is updated with the selected user. If the event status is Assigned, only the ownership changes to the selected user. |
Close Event | Disables any further event operations on the event. Closed events are not considered for calculating the status of a device. You can close events with statuses Open, Assigned, and Acknowledged only. |
Decline Ownership | Removes ownership of an event in the assigned state. This operation changes the event status to Acknowledged. |
Set Event Priority | Assigns a priority level to the event. |
Take Ownership | Assigns ownership of an Open or Acknowledged event to yourself. |
Unacknowledge Event | Changes a previously Acknowledged event back to the Open state. |
Add Notes | Displays the Add Notes dialog box. |
Create Incident | Creates an incident in BMC Helix IT Service Management – SmartIT. The incident ID appears against the impacted nodes. You can click the link to view the incident details in BMC Helix IT Service Management (you must have the required permissions to view incidents). |
For more information about the impact of the actions on the event, see Performing-event-operations in the BMC Helix Operations Management online documentation.
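The status changes described in the table can be sketched as a small state machine. This is an illustrative model only, not the BMC Helix API: the `Event` class and all function names are hypothetical, and only the actions whose resulting status is stated in the table are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """Hypothetical event record holding only status and owner."""
    status: str = "Open"
    owner: Optional[str] = None


def _require(event: Event, allowed: tuple, action: str) -> None:
    # Reject the action if the event is not in one of the allowed statuses.
    if event.status not in allowed:
        raise ValueError(f"Cannot {action} an event in status {event.status}")


def acknowledge(event: Event) -> None:
    # Open -> Acknowledged
    _require(event, ("Open",), "acknowledge")
    event.status = "Acknowledged"


def unacknowledge(event: Event) -> None:
    # Acknowledged -> Open
    _require(event, ("Acknowledged",), "unacknowledge")
    event.status = "Open"


def assign(event: Event, user: str) -> None:
    # Open or Acknowledged -> Assigned; an Assigned event only changes owner.
    _require(event, ("Open", "Acknowledged", "Assigned"), "assign")
    event.status = "Assigned"
    event.owner = user


def decline_ownership(event: Event) -> None:
    # Assigned -> Acknowledged, and the owner is removed.
    _require(event, ("Assigned",), "decline ownership of")
    event.status = "Acknowledged"
    event.owner = None


def close(event: Event) -> None:
    # Only Open, Assigned, and Acknowledged events can be closed.
    _require(event, ("Open", "Assigned", "Acknowledged"), "close")
    event.status = "Closed"


event = Event()
acknowledge(event)
assign(event, "operator1")
print(event.status, event.owner)  # Assigned operator1
```

Because a closed event accepts no further operations, every function in this sketch raises `ValueError` when called on an event whose status is Closed.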
To view the topological map of the service CIs
Click CI Topology to view the topological map of the service CIs and view the node details.
- (Optional) Use the various display options to maximize/minimize, drag or position, zoom in/out, and fit to center the topology map.
- From the map, select any node to view the node details.
- (Optional) Change the topology hierarchy, or enable or disable aggregation by CI Kind.
- (Optional) Modify the advanced filter to control the view of the topology map.
Based on the length of the selected criteria and available space to display, the filters are automatically tagged and grouped as +1 active, +2 active, and so on. You can click the tagged number to view the additional filters.
To view service hierarchy
- Click Service Hierarchy to view the service node details of parent and child services.
- Click Upstream Impact, Downstream Impact, or both to view the upstream (parent nodes) or downstream (child nodes) impact paths of the current service.
To view health indicators
Click Health Indicators to view the health indicators configured for the service. By default, the charts are displayed for the last 24 hours. You have options to view the health indicators for the last 3, 6, 12, or 24 hours. For more information, see Health Indicators and Adding or editing health indicators for a service.
To view metrics for an impacted service
Click Metrics to view the metrics chart for the top attributes of the causal node. If there are more than three metrics, only the top three trending metrics are displayed.
Based on the metric data and its trend, you can take action to resolve the issue. For more information, see Service-health-score-impact-score-and-metrics.
To discover Service-specific insights
Click Insights to discover the service behavior and its severity pattern over a predefined period, represented in the form of text summaries and graphs. These insights help operators take corrective measures to ensure service continuity.
- Service health behavior: For a service, the text summary shows the highest degradation percentage between two consecutive days, and the graph represents the daily average health score trend within the predefined period. For example, see the following behavior pattern for a period of four consecutive days.
Let's consider an example of a Financial Service, for which the daily average health score and their percentage changes are described in the following table.
The percentage change compared to the previous day is calculated as [(H2 - H1)/(H1)] x 100, where H1 is the average health score of the previous date and H2 is the average health score of the current date.
Date | Daily Avg. Health Score | Calculation | % change in Daily Avg. Health Score, compared to previous day |
---|---|---|---|
06/13/2022 | 62.50 | - | - |
06/14/2022 | 61.25 | [(61.25 - 62.5)/62.5] x 100 | - 2 % |
06/15/2022 | 60 | [(60-61.25)/61.25] x 100 | - 2.04 % |
06/16/2022 | 60.79 | [(60.79-60)/60] x 100 | + 1.32 % |
06/17/2022 | 59.86 | [(59.86-60.79)/60.79] x 100 | - 1.53 % |
06/18/2022 | 60 | [(60-59.86)/59.86] x 100 | + 0.23 % |
BMC Helix AIOps displays only the highest percentage degradation of average service health (e.g., 2.04%) in the summary text with the respective comparison dates. From the corresponding daily average health score trend, you can identify the zone of highest percentage degradation.
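The calculation in this example can be reproduced with a short script. This is a minimal sketch that uses the scores from the table. The function names are hypothetical, and it only illustrates the formula, not how BMC Helix AIOps computes the summary internally.

```python
from datetime import date

# Daily average health scores from the example table.
daily_scores = [
    (date(2022, 6, 13), 62.50),
    (date(2022, 6, 14), 61.25),
    (date(2022, 6, 15), 60.00),
    (date(2022, 6, 16), 60.79),
    (date(2022, 6, 17), 59.86),
    (date(2022, 6, 18), 60.00),
]


def percent_change(previous, current):
    """Return [(H2 - H1) / H1] x 100, where H1 is the previous day's score."""
    return (current - previous) / previous * 100


def highest_degradation(scores):
    """Return (previous_date, current_date, change) for the largest daily drop."""
    changes = [
        (prev_day, day, percent_change(prev_score, score))
        for (prev_day, prev_score), (day, score) in zip(scores, scores[1:])
    ]
    # The most negative change is the highest degradation.
    return min(changes, key=lambda item: item[2])


prev_day, day, change = highest_degradation(daily_scores)
print(f"{prev_day} -> {day}: {change:.2f}%")  # 2022-06-14 -> 2022-06-15: -2.04%
```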
- Service severity pattern: For a service, the text summary shows the duration for which the severities (Critical and Major) are periodically repeated, and the corresponding graph shows the severity occurrences highlighted for the predefined period. For example, see the following severity pattern with two repetitive durations on three consecutive days.
Consider the table below, showing the daily occurrences of Major and Critical severities of a Financial Service. Let's try to derive a pattern considering the periodical repetition of severities. We can see only the Major severity is occurring daily between 21:30 and 05:30 hours. However, based on the occurrences of Critical severity we can't derive any pattern.
Date | Severity | Duration |
---|---|---|
06/14/2022 | Major | 21:30 to 05:30 hrs |
06/14/2022 | Critical | 07:00 to 10:00 hrs |
06/15/2022 | Major | 21:30 to 05:30 hrs |
06/15/2022 | Critical | 09:00 to 11:00 hrs |
06/16/2022 | Major | 21:30 to 05:30 hrs |
06/16/2022 | Critical | 07:00 to 10:00 hrs |
BMC Helix AIOps displays the pattern for the occurrences of Major severity during the period. You can identify the pattern from the highlighted sections of the graph. In this example, the graph is displayed only for the Major severity because it repeats at the same time every day. No graph is displayed for the Critical severity because its pattern is broken on 06/15/2022.
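The reasoning in this example can be sketched in code. The rule used here (a severity has a pattern only if it occurs in exactly the same time window on every day of the period) is an assumption made for illustration. The actual detection logic in BMC Helix AIOps is not described in this topic, and the function name is hypothetical.

```python
from collections import defaultdict

# (date, severity, start, end) occurrences from the example table.
occurrences = [
    ("06/14/2022", "Major", "21:30", "05:30"),
    ("06/14/2022", "Critical", "07:00", "10:00"),
    ("06/15/2022", "Major", "21:30", "05:30"),
    ("06/15/2022", "Critical", "09:00", "11:00"),
    ("06/16/2022", "Major", "21:30", "05:30"),
    ("06/16/2022", "Critical", "07:00", "10:00"),
]


def repeating_severities(rows):
    """Return {severity: (start, end)} for severities with one identical daily window."""
    all_days = {day for day, _, _, _ in rows}
    windows = defaultdict(set)    # severity -> distinct (start, end) windows
    days_seen = defaultdict(set)  # severity -> days on which it occurred
    for day, severity, start, end in rows:
        windows[severity].add((start, end))
        days_seen[severity].add(day)
    # Assumed rule: exactly one window, and it occurs on every day in the period.
    return {
        severity: next(iter(spans))
        for severity, spans in windows.items()
        if len(spans) == 1 and days_seen[severity] == all_days
    }


print(repeating_severities(occurrences))  # {'Major': ('21:30', '05:30')}
```

Critical is excluded because its window on 06/15/2022 (09:00 to 11:00) differs from the window on the other two days.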