Remediating events automatically by using automation policies

This use case describes how BMC IT performs intelligent recovery of business applications and improves MTTR by using BMC Helix Intelligent Automation to automate remediation actions for frequently occurring events in BMC Helix AIOps.

Customer success: BMC IT improves MTTR by automating remediation actions for events

BMC IT uses BMC Helix AIOps to monitor the IT infrastructure. When a service or process is down, an operator or a site reliability engineer (SRE) typically spends hours investigating the event, creating an incident, and, if needed, restarting the service or process. When a business-critical process is down, the resulting service outage can last a long time, until the problem is investigated and remediated.

BMC IT uses the advanced Intelligent Automations feature provided by BMC Helix AIOps to automatically remediate process-down events by restarting the affected processes. Automation engineers create automation policies in BMC Helix Intelligent Automation, and these policies appear as automation actions against the events in BMC Helix AIOps.

In the following example, a situation in BMC Helix AIOps notifies the operator or SRE that an important process is down and shows the automation actions available against each event included in the situation. 

BMC IT uses automation to restart a process without any manual intervention. After the automation is run, the status and the incident ID are displayed for the event. 

An operator or SRE who has the appropriate permissions can view the details of the automation in BMC Helix Intelligent Automation by using the cross-launch link.


Workflow

Perform the following tasks to make remediation actions available as automations for events in BMC Helix AIOps:

| Task | Product | Role | Action | Reference |
|------|---------|------|--------|-----------|
| 1 | BMC Helix AIOps | Tenant Administrator | Enable the Intelligent Automations feature from the Configurations menu. | Enabling the AIOps features |
| 2 (Optional) | BMC Helix AIOps | Operator or SRE | Request automation for an event under the Services or Situations menu. | Requesting for a new automation |
| 3 | BMC Helix Intelligent Automation | Automation Engineer | Based on an incoming request or for frequently occurring issues, create an automation policy that contains remediation actions. Automation engineers can set the execution mode to Automatic to trigger remediation actions automatically. | Creating automation policies |
| 4 | BMC Helix AIOps | Operator or SRE | View events and run the automation actions available against the event. | Running an existing automation |
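
The relationship between a policy's execution mode and the point at which its remediation action runs (tasks 3 and 4) can be sketched in a few lines of Python. This is only a conceptual model, not the BMC Helix API: the names Event, AutomationPolicy, ExecutionMode, handle_event, and restart_process are all hypothetical, and the ON_DEMAND mode is an assumption standing in for any policy whose execution mode is not Automatic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class ExecutionMode(Enum):
    AUTOMATIC = "automatic"  # action runs as soon as a matching event arrives
    ON_DEMAND = "on_demand"  # assumed name: action waits for an operator or SRE


@dataclass
class Event:
    host: str
    message: str


@dataclass
class AutomationPolicy:
    name: str
    matches: Callable[[Event], bool]  # selects the events the policy applies to
    action: Callable[[Event], str]    # remediation action, returns a status text
    mode: ExecutionMode


def restart_process(event: Event) -> str:
    # Placeholder for a real remediation step such as restarting a process.
    return f"restarted process on {event.host}"


def handle_event(event: Event, policies: List[AutomationPolicy]) -> Optional[str]:
    """Run the first matching Automatic policy and return its status.

    Returns None when no policy matches or when the first matching policy
    is not Automatic, in which case the action is left for someone to run.
    """
    for policy in policies:
        if policy.matches(event):
            if policy.mode is ExecutionMode.AUTOMATIC:
                return policy.action(event)
            return None
    return None


restart_policy = AutomationPolicy(
    name="restart-on-process-down",
    matches=lambda e: "process down" in e.message.lower(),
    action=restart_process,
    mode=ExecutionMode.AUTOMATIC,
)
status = handle_event(Event("app01", "Process down: httpd"), [restart_policy])
```

In this model, switching the policy's mode to ON_DEMAND leaves the event untouched until someone runs the action, which mirrors task 4, where an operator or SRE runs the available automation from the event.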


Results

By implementing the remediation workflow, BMC IT achieved the following results:

  • Automated remediation of frequently occurring issues, which eliminated the need to manually investigate the event, create incidents, and restart the processes that stopped running.
  • Capability to request automations if automation actions are not available yet.
  • Increased system reliability and improved MTTR from 30-40 minutes to less than five minutes.
  • Reports for analyzing results driven by the automated remediation actions.