Situations overview
The following figure shows how situations are created in BMC Helix AIOps:
Types of situations
Events that are correlated and aggregated based on the event correlation policies created in BMC Helix Operations Management are known as policy-based situations. For more information, see Policy-based situations.
Events that are correlated and aggregated based on the AI/ML algorithms are known as ML-based situations.
On the Situations page, a list icon indicates a policy-based situation, and an ML-based icon indicates an ML-based situation.
Independent, primary, and similar situations
Based on the type of correlation, situations are categorized into independent, primary, or similar situations.
On the Situations page, you can view:
- All Situations: Lists the independent situations, groups of all primary situations, and similar situations. An independent situation is an ML-based situation that does not belong to any grouped situation. Primary situations reduce noise by grouping similar open situations that occurred due to a similar issue and impacted multiple services across a service hierarchy. Instead of troubleshooting each service and its situation separately, operators or site reliability engineers (SREs) can troubleshoot the primary situation, which is marked with its own icon. BMC Helix AIOps leverages AI/ML algorithms to find situation similarity due to temporal, topological, or knowledge graph relationships. This information helps reduce event noise and improve the mean time to resolve (MTTR).
Primary situations are formed based on the following options defined on the Manage Situations page under Configurations:
- Correlation Event Time Window (in mins): Configure the time limit to determine whether a correlation event will be a part of a situation.
- Situation Stability Window (in mins): Configure the time limit to add new correlation events to a situation.
- Similar Situations: Lists groups of all similar situations from the same service node. BMC Helix AIOps uses AI/ML algorithms to group situations of a similar nature based on their repeated impact on a service in the past. Operators or SREs can perform historical analysis on problems, look at the number of incidents raised, automation runs, the severity of situations, and time of past occurrences, and take meaningful actions. Similar situations also help in faster root cause isolation. For any open situation, similar situations provide much-needed context to understand how a similar situation was resolved in the past, the actions taken to resolve it, and the root cause that was identified. Based on this contextual information, operators or SREs can perform similar actions to diagnose and remediate the situation and therefore reduce MTTR.
Similar situations are formed based on the following options defined on the Manage Situations page under Configurations:
- Expiry of Similar Situation Group (in days): Configure the maximum number of days that a group of similar situations can remain idle before it expires.
- Similar Situation Detection Window (in hours): Configure the interval, in hours, at which the detection of similar situations runs and forms groups.
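The two options for similar situations can be read as a simple idle-time check and a recurring schedule. The following Python sketch is only an illustration of that reading and is not taken from BMC Helix AIOps: the SimilarSituationGroup class, its fields, and both functions are hypothetical names, and the sketch assumes that a group expires only after it has been idle for strictly longer than the configured number of days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SimilarSituationGroup:
    # Hypothetical model for illustration; not a BMC Helix AIOps data structure.
    name: str
    last_activity: datetime  # when a situation was last added to the group


def is_expired(group: SimilarSituationGroup, now: datetime, expiry_days: int) -> bool:
    """Return True when the group has been idle for longer than expiry_days.

    Assumption: a group that has been idle for exactly expiry_days is kept.
    """
    return now - group.last_activity > timedelta(days=expiry_days)


def next_detection_run(last_run: datetime, detection_window_hours: int) -> datetime:
    """Return the start time of the next detection pass after last_run."""
    return last_run + timedelta(hours=detection_window_hours)
```

Under these assumptions, a group that was last active 30 days ago counts as expired with a 7-day setting but not with a 30-day setting.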
BMC HelixGPT-based summary, best action recommendations, log insights, and a virtual agent – Ask HelixGPT
The following video (5:05) provides an overview of the transformative power of BMC Helix AIOps and observability enhanced by BMC HelixGPT:
Watch the YouTube video about Unlocking AIOps and Observability with BMC HelixGPT.
BMC Helix AIOps connects with BMC HelixGPT to leverage the generative AI capabilities that help operators or SREs understand the root cause of a situation faster by providing a human-readable AI-generated situation summary. This summary gives a synopsis of a short problem classification, a brief root cause summary, and a causal summary explaining the complete context.
Best action recommendations
By using the generative AI capabilities, BMC HelixGPT provides a step-by-step action plan for remediating the situation. These remediation steps are called best action recommendations and can be used by the operators or SREs to resolve the events. Best action recommendations help close situations faster and improve the mean time to resolve (MTTR).
BMC HelixGPT generates these recommendations by evaluating information from the following sources:
- If similar situations occurred in the past, BMC HelixGPT looks for the resolution notes that might have been added during the closing of incidents in BMC Helix IT Service Management for these similar situations.
- If no similar situations are found, BMC HelixGPT evaluates the resolution notes added in incidents of related events.
- If no incidents are available, event messages of the root cause events are evaluated, and recommendations are suggested.
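The order of preference described in this list can be sketched as a fallback chain. The following Python example is a minimal illustration and does not show how BMC HelixGPT is implemented: the SituationContext class, its field names, and the source labels are hypothetical, and the sketch assumes that a source with no usable text is treated the same as a source that does not exist.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SituationContext:
    # Hypothetical container; the field names are illustrative only.
    similar_situation_notes: List[str] = field(default_factory=list)
    related_event_incident_notes: List[str] = field(default_factory=list)
    root_cause_event_messages: List[str] = field(default_factory=list)


def select_recommendation_source(ctx: SituationContext) -> Tuple[str, List[str]]:
    """Return the first non-empty source in the documented order of preference."""
    # 1. Resolution notes from incidents of similar past situations.
    if ctx.similar_situation_notes:
        return "similar_situation_incidents", ctx.similar_situation_notes
    # 2. Resolution notes from incidents of related events.
    if ctx.related_event_incident_notes:
        return "related_event_incidents", ctx.related_event_incident_notes
    # 3. Fall back to the messages of the root cause events.
    return "root_cause_event_messages", ctx.root_cause_event_messages
```

For example, a context that has only related-event incident notes and root cause event messages resolves to the second source, because the first one is empty.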
BMC Helix AIOps can connect with BMC Helix IT Service Management, Jira Service Management, or ServiceNow ITSM through BMC HelixGPT to generate the best action recommendations based on the incidents in these supported incident management systems. For more information about configuring incident sources in BMC HelixGPT, see Adding data sources in BMC HelixGPT.
Along with the remediation steps, a code wizard provides sample scripts in Ansible, Python, and Bash that can be used to perform the recommended step. For example, if a situation is created for a network bandwidth utilization issue identified on a host, one of the recommended actions could be to increase the network bandwidth. The code wizard generates a script to increase the network bandwidth, which can be used to implement automation and resolve similar issues in the future.
Log insights
BMC Helix AIOps connects with the supported log repositories through BMC HelixGPT to analyze, summarize, and derive actionable insights from the logs related to a service. Apart from connecting with BMC HelixGPT, BMC Helix AIOps also connects with external log data sources such as Splunk Enterprise and Elasticsearch to leverage your existing log repositories. For more information about configuring the data sources, see Adding data sources in BMC HelixGPT.
The actionable insights are displayed under the Log Insights option and can be used to identify the root cause of situations. Operators or SREs can use the cross-launch link to view the detailed logs in the supported log repositories.
Ask HelixGPT
An integrated virtual agent, Ask HelixGPT, leverages the BMC HelixGPT generative AI capabilities to answer the following predefined questions about a situation:
- What is the impact of the issue?
- Which team can solve this issue?
- Has this situation happened in the past?
- Are there any change windows active during this situation?
BMC HelixGPT generates these answers by evaluating the incidents created for similar situations in BMC Helix IT Service Management, the timestamps and patterns of similar situations that occurred in the past, the service health score of the service impacted by the situation, and the change requests associated with the situation.
By using these BMC HelixGPT capabilities, operators can improve operational efficiency, derive insights from all connected sources, and reduce manual errors by implementing automation to resolve problems faster.
Incident management for situations
BMC Helix AIOps connects with BMC Helix IT Service Management to show incidents created for situations. If Proactive Service Resolution is configured in BMC Helix Intelligent Automation, a single consolidated incident is created for a situation instead of separate incidents for the situation and its events. For more information, see Overview of Proactive Service Resolution.
Where to go from here
To view the primary, independent, and similar ML-based situations, see Monitoring situations.