Total incident count and mean time to resolve (MTTR) indicators for a reliable incident-response process


Incidents

An incident is any event that is not part of the standard operation of a service and that causes an interruption or a reduction in the quality of that service. 

What are the incident sources?

The Total Incidents widget displays INCIDENT_INFO events from BMC Helix Operations Management.

The Overview page displays the total incidents for a selected time range. This count includes incidents in the Open, Assigned, In Progress, Pending, and Reopened states. Closed, Cancelled, and Resolved incidents are not included in this count.

In the example, there are 35 incidents in the last 24 hours:

total_incidents_concept_243.png
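The counting rule described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the record layout (the `status` and `created` fields), the `count_total_incidents` helper, and the choice to match the time range against the creation time are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# States that count toward the total, as listed in the description above.
ACTIVE_STATES = {"Open", "Assigned", "In Progress", "Pending", "Reopened"}


def count_total_incidents(incidents, start, end):
    """Count incidents in an active state created within [start, end].

    The 'status' and 'created' keys are hypothetical field names.
    """
    return sum(
        1
        for inc in incidents
        if inc["status"] in ACTIVE_STATES and start <= inc["created"] <= end
    )


# Hypothetical sample data for a "Last 24 hours" time range.
now = datetime(2024, 1, 2, 12, 0)
sample = [
    {"status": "Open", "created": now - timedelta(hours=2)},
    {"status": "In Progress", "created": now - timedelta(hours=20)},
    {"status": "Resolved", "created": now - timedelta(hours=5)},   # excluded: state
    {"status": "Pending", "created": now - timedelta(hours=30)},   # excluded: outside range
]

total = count_total_incidents(sample, now - timedelta(hours=24), now)
print(total)
```

With this sample, only the Open and In Progress incidents fall inside the range and in a counted state.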

Mean time to resolve (MTTR)

MTTR is the average time taken to resolve a set of incidents. The metric includes the time spent on alerting and diagnosis before repair activities begin. MTTR therefore serves as an indicator of both the reliability and the availability of a system. Reliability refers to the probability that the service remains operational over its life cycle. Availability refers to the probability that a system is operational at any given point in time. The shorter the MTTR, the higher the reliability and availability of the system.

What is the source of incidents for MTTR computation?

To compute the MTTR value, BMC Helix AIOps considers INCIDENT_INFO events from BMC Helix Operations Management.

The Overview page displays the MTTR and its trend for a selected time range, as shown in the following example. In the example, the average time taken to resolve 4 incidents in the last 7 days is 2 days and 3 hours:

mttr_concept_243.png

MTTR computation

MTTR is computed as follows:

MTTR = Total time taken to resolve the incidents resolved in the selected time range / Total number of incidents resolved in the selected time range

Example

Total incidents resolved in the last 7 days = 4

Time range selected is Last 7 days

Time taken to resolve these 4 incidents:

  • Incident 1 was resolved in 44 hours 
  • Incident 2 was resolved in 100 hours 
  • Incident 3 was resolved in 40 hours 
  • Incident 4 was resolved in 20 hours

MTTR = (44 + 100 + 40 + 20) / 4 = 204 / 4 = 51 hours = 2d 3h
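The same arithmetic can be reproduced with a short Python sketch. The function names `mean_time_to_resolve` and `format_duration` are hypothetical helpers written for this example, and rounding to whole hours is an assumption about how the value is displayed.

```python
def mean_time_to_resolve(resolution_hours):
    """Return the average resolution time in hours, or None if nothing was resolved."""
    if not resolution_hours:
        return None
    return sum(resolution_hours) / len(resolution_hours)


def format_duration(hours):
    """Format a duration in hours as days and hours, rounded to the nearest hour."""
    days, remaining_hours = divmod(round(hours), 24)
    return f"{days}d {remaining_hours}h"


# Resolution times of the four incidents in the example, in hours.
mttr_hours = mean_time_to_resolve([44, 100, 40, 20])
print(mttr_hours, format_duration(mttr_hours))  # 51.0 2d 3h
```

Returning None for an empty list avoids a division by zero when no incidents were resolved in the selected time range.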

 
