Total incident count and mean time to resolve (MTTR) indicators for a reliable incident-response process
The Overview page displays the total incidents for a selected time range. This count includes incidents in the Open, Assigned, In Progress, Pending, and Reopened states. Closed, Cancelled, and Resolved incidents are not included in this count.
In the example, the Overview page shows 35 incidents in the last 24 hours.
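The counting rule described above can be sketched in a few lines of Python. This is only an illustration: the record shape (a dictionary with a "state" key) and the names COUNTED_STATES and count_total_incidents are assumptions made for the example, not part of the product.

```python
# States that count toward the total, as listed in the text above.
COUNTED_STATES = {"Open", "Assigned", "In Progress", "Pending", "Reopened"}


def count_total_incidents(incidents):
    """Count incidents whose state is included in the Overview total.

    `incidents` is assumed to be an iterable of dicts with a "state" key
    (a hypothetical record shape used only for this sketch).
    """
    return sum(1 for incident in incidents if incident["state"] in COUNTED_STATES)


# Hypothetical sample data for the selected time range.
sample = [
    {"id": 1, "state": "Open"},
    {"id": 2, "state": "Resolved"},     # excluded
    {"id": 3, "state": "In Progress"},
    {"id": 4, "state": "Closed"},       # excluded
    {"id": 5, "state": "Reopened"},
]

print(count_total_incidents(sample))  # 3
```

Closed, Cancelled, and Resolved incidents fall outside the set, so they are skipped without needing a separate exclusion list.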
Mean time to resolve (MTTR)
MTTR is the average time taken to resolve a set of incidents. It includes the time spent on alerting and diagnosis before repair activities begin. MTTR is therefore an indicator of both the reliability and the availability of a system. Reliability is the probability that the service remains operational over its life cycle. Availability is the probability that the system is operational at any given point in time. The shorter the MTTR, the higher the reliability and availability of the system.
The Overview page displays the MTTR and its trend for a selected time range. In the following example, the average time taken to resolve 4 incidents in the last 7 days is 2 days and 3 hours:
MTTR computation
MTTR is computed as follows:
MTTR = Total time taken to resolve the incidents resolved in the selected time range / Total number of incidents resolved in the selected time range
Example
Total incidents resolved in the last 7 days = 4
Time range selected is Last 7 days
Time taken to resolve these 4 incidents:
- Incident 1 was resolved in 44 hours
- Incident 2 was resolved in 100 hours
- Incident 3 was resolved in 40 hours
- Incident 4 was resolved in 20 hours
MTTR = (44 + 100 + 40 + 20) / 4 = 204 / 4 = 51 hours = 2d 3h
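The same arithmetic can be checked with a short Python sketch. The function names mean_time_to_resolve and format_days_hours are hypothetical helpers written for this illustration, and the sketch assumes resolution times are supplied in hours. It does not describe how the Overview page computes the value internally.

```python
from datetime import timedelta


def mean_time_to_resolve(resolution_hours):
    """Return the MTTR as a timedelta, or None if no incidents were resolved.

    `resolution_hours` is assumed to hold the time to resolve each incident,
    in hours, for the selected time range.
    """
    if not resolution_hours:
        return None  # avoid dividing by zero when nothing was resolved
    return timedelta(hours=sum(resolution_hours) / len(resolution_hours))


def format_days_hours(duration):
    """Format a timedelta as 'Nd Nh', dropping any leftover minutes."""
    hours = duration.seconds // 3600
    return f"{duration.days}d {hours}h"


# The four incidents from the example above.
mttr = mean_time_to_resolve([44, 100, 40, 20])
print(format_days_hours(mttr))  # 2d 3h
```

Returning None for an empty list is a design choice for the sketch: with no resolved incidents in the range, the average is undefined rather than zero.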