Investigating ML-based situations
To investigate a primary situation
A primary situation consists of a group of open situations that occurred due to a similar issue and impacted multiple services across the service hierarchy. Instead of troubleshooting each service and its situation separately, operators or SREs can investigate the primary situation and take relevant actions, which helps reduce situation noise.
- On the BMC Helix AIOps console, click Situations.
All situations that occurred in the last 24 hours are displayed in a hierarchical view. Primary situations are indicated by an icon.
To learn more about primary situations, see Situations-overview.
- Expand the primary situation group to view all related situations and identify the root cause situations.
A root cause situation is indicated by the target icon. There can be multiple root cause situations under a primary situation.
- Click the primary situation and view the following details on the situation details page:
- Situation name, severity, priority, last modified date, and status
- Incident ID: Click to open the incident in BMC Helix IT Service Management.
If an incident is not created, a Create Incident link is displayed. Click the link to create an incident in BMC Helix IT Service Management (requires a subscription to BMC Helix IT Service Management).
- Name of the impacted service: Click the name to open the service details in a new tab.
- List of related situations
- In the Related Situations section, identify the root cause situation (indicated by the target icon) and analyze attributes such as time of occurrence, number of similar situations, number of related events, type, severity, priority, status, and incident ID.
- (Optional) Click the situation name or the related events link to open the situation in a new tab.
- (Optional) Click the Incident ID link to open the incident in BMC Helix IT Service Management.
- (Optional) Click the action menu to perform actions on a situation.
For more information, see Performing-situation-actions.
- Continue with To investigate an independent situation.
To investigate an independent situation
- Click an open, independent situation and view the following details:
Situation name, severity, priority, incident ID, status, and the name of the impacted service.
If Proactive Service Resolution is enabled in BMC Helix Intelligent Automation, the same incident ID is displayed against the situation and the events. If an incident is not created, the Create Incident option is displayed.
- Situation highlight: Number of events from the top three impacted hosts.
- Similar situations (if more than one similar situation is available).
- Situation explanation
- CI topology and analysis
- If BMC HelixGPT is enabled:
- A human-readable AI-generated summary of the situation.
- Best action recommendations, a list of suggested steps that can be used to remediate the situation. Additionally, a BMC HelixGPT-driven wizard offers sample code to accomplish individual steps in different languages such as Ansible, Python, and Bash.
- Log insights collected from logs generated in BMC Helix Log Analytics, which help identify an accurate root cause of the problem.
- An integrated virtual agent, Ask HelixGPT, which uses the BMC HelixGPT generative AI capabilities so that you can ask questions to better investigate and remediate the situation.
To learn more about BMC HelixGPT capabilities, see Situations-overview.
- Continue with To view best action recommendations.
To view best action recommendations
Best action recommendations are available if BMC HelixGPT is enabled. To enable BMC HelixGPT, contact BMC Support.
- On the situation details page, review the AI-generated summary (short problem statement, brief summary, and detailed problem context).
- Click Show remediation steps.
The recommended steps are displayed for a situation.
For example, for a High CPU Utilization issue, a list of suggested remediation steps is displayed.
- (If available) Click Code wizard.
The code that can be used to run the recommended step is displayed. For some manual steps, the code wizard might not be displayed.
- Select your preferred language (Ansible, Python, or Bash). The code is displayed in the selected language.
- Click Copy to clipboard and use the code in your existing script to run the recommended remediation step.
- Close the code wizard.
- Continue with To view log insights.
To view log insights
Log insights are available if BMC HelixGPT is enabled. To enable BMC HelixGPT, contact BMC Support.
BMC HelixGPT connects with your log repository, including BMC Helix Log Analytics, and shows an in-depth analysis of the time-sliced runtime logs from diverse systems to identify the root cause of the situations.
For a situation, operators or SREs can view the BMC HelixGPT-generated summary of the logs in the Log Insights section, which helps in identifying the root cause of the situation. Use the cross-launch link to view the log details in BMC Helix Log Analytics.
To use the Ask HelixGPT virtual agent to get more information about the situation
The Ask HelixGPT virtual agent is available if BMC HelixGPT is enabled. To enable BMC HelixGPT, contact BMC Support.
Use the Ask HelixGPT virtual agent to ask questions within the context of an open or past situation. Using the BMC HelixGPT capabilities, operators can get information about diverse topics regarding infrastructure, service health, and near real-time predictions.
- Click Ask HelixGPT.
The interactive virtual agent dialog box displays the following predefined questions:
- What is the impact of the issue?
- Which team can solve this issue?
- Has this situation happened in the past?
- Are there any change windows active during this situation?
- Click any question to get additional information about the situation.
BMC HelixGPT generates the answer by evaluating information from the incidents created for similar situations in BMC Helix IT Service Management, analyzing the time stamps and patterns of similar situations that occurred in the past, analyzing the service health score of the impacted service, and reviewing the change requests associated with the situation.
For example, if you click What is the impact of the issue?, an answer that describes the impact is displayed.
- (Optional) Click any other question to obtain more details about the situation.
- Continue with To view similar situations.
To view similar situations
The Similar Situations section is displayed if at least two occurrences of similar situations are identified within a 24-hour window. This section is not displayed for a primary situation. The Aggregated View grid is displayed if there are situations available for a week, and the Detailed View grid is available if there are situations for a month.
- In the Similar Situations section, view the first and the most recent occurrences of similar situations.
- To analyze the pattern and trend of similar situations associated with the same service node, use Aggregated View or Detailed View:
- Aggregated View: Shows the occurrences of similar situations against the days of the week. The Y-axis represents the days of the week starting from Sunday to Saturday. The X-axis shows the hourly time slot for 24 hours.
- Detailed View: Shows the occurrences of similar situations in the last 30 days.
The Y-axis represents the day and date, and the X-axis represents the hourly time slot for 24 hours. This view captures data between the first and the most recent occurrence for the last 30 days. The detailed view is displayed even if the data is available for a single day.
- Click Show More to view similar situation details such as the time of occurrence, number of related events, type, severity, priority, status, and incident ID.
Clicking the Incident ID link opens the incident in BMC Helix IT Service Management (requires a subscription to BMC Helix IT Service Management).
- (Optional) Click the action menu to perform actions on a situation.
For more information, see Performing-situation-actions.
- Continue with To view the situation explanation.
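The Aggregated View layout described in this section (days of the week from Sunday to Saturday on the Y-axis, hourly slots on the X-axis) can be pictured as a count per day-and-hour cell. The following Python lines are a minimal sketch of that bucketing only; the function name and the sample time stamps are hypothetical and are not taken from the product, which might compute the grid differently.

```python
from collections import Counter
from datetime import datetime

# Row order described for the Aggregated View: Sunday through Saturday.
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def aggregate_by_weekday_and_hour(timestamps):
    """Count occurrences per (day of week, hour) cell."""
    counts = Counter()
    for ts in timestamps:
        # datetime.weekday() returns Monday=0 ... Sunday=6; shift so Sunday=0.
        day = DAYS[(ts.weekday() + 1) % 7]
        counts[(day, ts.hour)] += 1
    return counts


# Hypothetical occurrence times of similar situations for one service node.
occurrences = [
    datetime(2024, 3, 3, 9, 15),   # a Sunday
    datetime(2024, 3, 10, 9, 40),  # the following Sunday
    datetime(2024, 3, 6, 14, 5),   # a Wednesday
]
grid = aggregate_by_weekday_and_hour(occurrences)
```

In this sketch, a cell with a count greater than one, such as Sunday at hour 9, points to a recurring pattern for that day and time slot.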
To view the situation explanation
- In the Situation Explanation section, use the Root Cause View or List View to analyze the root cause events associated with the situation.
- Root Cause View: Shows the impact flow of events in a situation in a graphical format.
Based on the temporal and topological relationships between various causal events in the situation, the ML algorithm determines the root cause event and consequent events. Each event in the graph is aligned against the corresponding CI kind. The direction in the graph indicates the impact flow from the root cause event. You can see the impact score percentage displayed with the event. The total impact score from all the events adds up to 100 percent.
- Hover over an event to view the impacted node details and the corresponding CI or CI kind highlighted in the CI topology and analysis section.
- Click an event to view additional details on the Situation Details pane.
- List View: Displays all causal events and details such as the event messages, impacted host, occurrence time, severity, priority, status, and incident ID.
If Proactive Service Resolution is enabled in BMC Helix Intelligent Automation, the same incident ID is displayed against the situation and the events.
- Click an event message to view the following details in the Event Details pane:
- Event name, event score, severity, priority, status, and the More Details link to view the additional event details in BMC Helix Operations Management
- Event assignee details
- Date when the event first occurred or was last modified
- Event summary showing the Class, Incident ID, Object Class, Object, and Host. Clicking the Incident ID link opens the incident in BMC Helix IT Service Management.
For more information about event classes and objects, see EVENT base event class.
- Logs and notes history: All logs and notes for an event are displayed. Type a note in the text box and click Add Note to add any additional notes related to the event. Any note added for the event is reflected in the event in BMC Helix Operations Management.
- Performance view: If the slot value for the event class is Alarm, the time-series data collected from the key attributes of the causal events of ML-based situations is displayed.
- Click the action menu to perform event actions.
- Continue with To view CI topology and analysis.
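The Root Cause View shows an impact score percentage for each event, and the scores of all events add up to 100 percent. The following Python lines are a minimal sketch of that property only: they scale a set of raw weights into percentages. The event names and raw weights are hypothetical, and the sketch does not represent how the ML algorithm actually derives the scores or selects the root cause event.

```python
def to_impact_percentages(raw_weights):
    """Scale non-negative raw weights so that they sum to 100 percent."""
    total = sum(raw_weights.values())
    if total <= 0:
        raise ValueError("at least one event needs a positive weight")
    return {event: 100.0 * weight / total
            for event, weight in raw_weights.items()}


# Hypothetical causal events with hypothetical raw weights.
raw = {
    "disk_latency_high": 6.0,
    "db_response_slow": 3.0,
    "api_timeout": 1.0,
}
scores = to_impact_percentages(raw)
```

Because every score is a share of the same total, comparing two percentages in the graph shows how much of the overall impact is attributed to each event.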
To view CI topology and analysis
- In the CI topology and analysis map, view the topology map of the situation, the impacted CIs, and the probability of the impact on the connected CIs.
- Use the following options to view the map based on your requirements:
- Views: Switch between the Organic and Hierarchic views to view the impact flow.
In the organic view, nodes are placed close to their adjacent nodes, which saves space. In the hierarchic view, the nodes are distributed into layers, which makes it easier to identify dependencies and relationships among the nodes.
- Advanced filters: Apply filters to view the impact based on the selected criteria.
- Grouping: Click Enable Grouping by CI Kind to view the topology map grouped by the CI kind.
- Search: If there are many CIs in the hierarchy, use the search box to locate a particular CI.
- Legend: Click to view the legends used for the topological map.
- Use the other tools to zoom in, zoom out, or view the map on a full screen.
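To illustrate why a layered (hierarchic) arrangement makes dependencies easier to read, the following Python lines are a minimal sketch that assigns each node a layer equal to its hop count from a starting node, by using a breadth-first traversal. The CI names and edges are hypothetical, and the layout logic used by the topology map is not documented here; real layout engines typically use more elaborate layering.

```python
from collections import deque


def assign_layers(edges, root):
    """Return {node: layer}, where layer is the fewest hops from root."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    layers = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in layers:  # first visit is the shortest path
                layers[child] = layers[node] + 1
                queue.append(child)
    return layers


# Hypothetical dependency edges (parent depends on child).
edges = [
    ("online-store", "web-server"),
    ("web-server", "app-server"),
    ("online-store", "app-server"),
    ("app-server", "database-host"),
]
layers = assign_layers(edges, "online-store")
```

Nodes that share a layer number would be drawn on the same row, so a reader can follow dependencies from the top layer downward.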
Where to go from here
To perform additional actions on a situation or on the events included in the situation, see Performing-situation-actions.