Detecting major incidents


Major incidents are high-criticality, high-impact incidents that typically affect a large number of users within an organization, so the major incident management process is especially critical and urgent for the IT operations team. Because a major incident demands detailed analysis and a team of specialists to work on it, organizations need to establish a robust major incident management process that is appropriately supported by data.

A data-supported major incident management process provides the following benefits:

  • Service Desk Managers can monitor the incoming incidents in real time for major incident candidates.
  • Major Incident Managers can then rally the team to create or convert a major incident candidate and conduct further analysis as part of the Major Incident Management process.
  • Major incidents can be identified early, which minimizes lost time, cost, and business value.

Customer success: Service Desk Managers monitor the incoming incidents in real time for major incident candidates.

Scenario

Susan, the Service Desk Manager at Invention Inc., monitors the Real-time Incident Correlation workspace to view the trend of incoming incidents. She notices that the CRM-Enterprise-Monitoring cluster has a possible major incident. Many users have reported that the CRM application is taking a long time to load and perform routine actions. The Sales and Marketing teams across various offices at Invention Inc. are affected: the Sales team is unable to create reports or view the sales pipeline, and the Marketing team cannot run campaigns.

Susan finds that 54 new tickets were added to the cluster in the last hour. She finds the parent incident in the cluster, marks it as a candidate for a major incident in Smart IT, and then assigns it to the Major Incident Manager. The Major Incident Manager at Invention Inc. uses BMC Helix ITSM Major Incident Management to track and manage this major incident.
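
The kind of check Susan performs by eye can be sketched in a few lines of Python. This is only an illustration of counting recent incidents per cluster, not a description of how ITSM Insights works internally. The function name, the (cluster, timestamp) data shape, the threshold of 50, and the sample timestamps are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical threshold chosen for illustration; not a product setting.
CANDIDATE_THRESHOLD = 50


def find_candidate_clusters(incidents, now, window=timedelta(hours=1),
                            threshold=CANDIDATE_THRESHOLD):
    """Return {cluster: count} for clusters that received at least
    `threshold` incidents within `window` before `now`.

    `incidents` is an iterable of (cluster_name, created_at) pairs.
    """
    cutoff = now - window
    counts = Counter(
        cluster
        for cluster, created_at in incidents
        if cutoff <= created_at <= now
    )
    return {cluster: n for cluster, n in counts.items() if n >= threshold}


# Sample data mirroring the scenario: 54 tickets in one cluster in the last hour.
now = datetime(2024, 1, 1, 12, 0)  # arbitrary sample timestamp
incidents = [("CRM-Enterprise-Monitoring", now - timedelta(minutes=i))
             for i in range(54)]
incidents += [("Email", now - timedelta(minutes=5 * i)) for i in range(3)]

print(find_candidate_clusters(incidents, now))
# {'CRM-Enterprise-Monitoring': 54}
```

A fixed count over a fixed window is the simplest possible rule. In practice the trend and major incident settings configured by the Service Desk Manager determine what is flagged.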

Workflow for detecting major incidents

The following graphic describes the tasks to be performed when detecting and managing major incidents:

[Figure: MI Detection Workflow]

The following table describes the tasks to be performed when detecting and managing major incidents:

| Task | Component | Role | Action | Reference |
|------|-----------|------|--------|-----------|
| 1 | Real-time incident correlation configuration in ITSM Insights | Service Desk Manager | Configures the trend and major incident settings | |
| 2 | Real-time Incident Correlation dashboard in ITSM Insights | Service Desk Manager | Checks the dashboard for probable major incident indicators; drills down into the clusters and views the incidents in the cluster; clicks an incident in the drill-down view (the incident opens in a new tab in BMC Helix ITSM) | |
| 3 | Incident Management in BMC Helix ITSM | Service Desk Manager; Service Desk agent | Marks the incident as a candidate for a major incident, or creates a new incident as a candidate for a major incident, in BMC Helix ITSM; assigns the incident to a Major Incident Manager | |
| 4 | Incident Management in BMC Helix ITSM | Major Incident Manager | Analyzes the impact and categorizes the incident accordingly; forms the Major Incident Management team | |

Results

The Major Incident Management team works with the relevant teams to fix the CRM application issues. The CRM application is available again, and users can resume their routine tasks.

 
