Detecting incidents

An incident is an episode characterized by abnormal performance, availability, or web traffic volume as compared to a baseline. Examples of incidents include:

Exceptionally low traffic volume
Poor performance
Service level threshold (SLT) violations
Frequent availability errors
Periods of web application unavailability

The system prioritizes incidents by type of abnormality, duration, and the number of users affected.

To detect incidents, set up your system to monitor the appropriate parameters and take the appropriate measurements, as described in the following sections.

What does the system monitor for incidents?

The system can monitor Watchpoints, SLTs, and error conditions for incidents:

Watchpoints — The system identifies performance, traffic volume, and some availability incidents by monitoring specified Watchpoints.

Service Level Thresholds — To identify performance-related incidents, the system relies on page SLTs, which are rules that determine whether a page was delivered to your end users within an acceptable time. By default, the system detects an incident when some predefined percentage of requests to your web application violate their SLTs. The exact percentage is configurable in the incident settings.

Error rules — The system can monitor error rules, so that when a certain error occurs, it is reported as an incident.

Note

The system does not support monitoring of informational conditions for incidents.
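The SLT-based rule described above can be sketched as follows. This is an illustration only, not the product's implementation: the function names, the use of response time in seconds as the SLT, and the threshold passed in by the caller are all assumptions made for the example.

```python
def slt_violation_percentage(response_times, slt_seconds):
    """Percentage of requests whose response time exceeds the SLT.

    Assumption: a request "violates" its SLT when it takes longer than
    slt_seconds. The real rule definition may differ.
    """
    if not response_times:
        return 0.0
    violations = sum(1 for t in response_times if t > slt_seconds)
    return 100.0 * violations / len(response_times)


def is_performance_incident(response_times, slt_seconds, threshold_percent):
    """True when the share of SLT violations reaches the configured percentage.

    threshold_percent stands in for the value set in the incident settings;
    no default is given here because the source does not state one.
    """
    return slt_violation_percentage(response_times, slt_seconds) >= threshold_percent


# Hypothetical sample: four page requests, SLT of 4 seconds.
sample = [1.0, 2.0, 5.0, 6.0]
share = slt_violation_percentage(sample, 4.0)
```

With this sample, two of the four requests exceed the SLT, so whether an incident is raised depends entirely on the percentage configured in the incident settings.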

Measuring abnormality

Abnormality describes how exceptional or atypical an incident is.

The system measures abnormality on a logarithmic scale. It establishes a baseline for normal trends in your web traffic, and then calculates how many standard deviations your web traffic deviates from that baseline during a 30-minute period.
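As a rough illustration of the calculation, the following sketch computes how far an observed value lies from a baseline, expressed in standard deviations. How the product actually builds its baseline is not described here, so the sketch assumes the baseline is simply a list of historical measurements for comparable 30-minute periods; the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev


def deviations_from_baseline(baseline_samples, observed):
    """Return how many standard deviations `observed` lies from the baseline mean.

    Assumption: the baseline is summarized by the mean and population
    standard deviation of historical samples.
    """
    mu = mean(baseline_samples)
    sigma = pstdev(baseline_samples)
    if sigma == 0:
        # Flat baseline: no spread to measure against.
        return 0.0 if observed == mu else float("inf")
    return abs(observed - mu) / sigma


# Hypothetical request counts for earlier 30-minute periods.
history = [98, 102, 98, 102]
distance = deviations_from_baseline(history, 106)
```

The absolute value is used because traffic volume can be abnormal in either direction, high or low.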

The number of standard deviations away from the baseline is the incident's abnormality. Each type of incident has its own type of abnormality, as described in the following table:

Incident type	Type of abnormality
Availability	The system detected an abnormally high percentage of requests with errors for a particular Watchpoint.
Performance	The system detected an abnormally high percentage of pages that violated their SLTs for a particular Watchpoint.
Volume	The system detected an abnormally high or abnormally low amount of traffic on a particular Watchpoint.

By default, the system represents abnormality as an index value from 0 to 4.

Abnormality rating based on multiples of the standard deviation

Number of standard deviations from baseline	Abnormality rating
Less than or equal to 1	0
1 – 2	1
2 – 4	2
4 – 16	3
Greater than or equal to 16	4
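The table can be expressed as a small lookup function. The table states the boundaries at 1 and 16 explicitly, but its middle ranges share their endpoints, so this sketch assumes that exactly 2 standard deviations rates 2 and exactly 4 rates 3. That choice, and the function name, are assumptions for illustration.

```python
def abnormality_rating(deviations):
    """Map a number of standard deviations from baseline to a 0-4 rating.

    Boundaries at 1 and 16 follow the table; treatment of exactly 2 and
    exactly 4 is an assumption (each goes to the higher rating).
    """
    if deviations <= 1:
        return 0
    if deviations < 2:
        return 1
    if deviations < 4:
        return 2
    if deviations < 16:
        return 3
    return 4


# Ratings for a few sample deviation values.
ratings = [abnormality_rating(d) for d in (0.5, 1.5, 3, 10, 20)]
```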

Measuring affected user sessions

For each type of incident, the system calculates the number of user sessions affected, that is, the number of distinct sessions detected for the Watchpoint during the period when the incident occurred.
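Counting affected sessions amounts to counting distinct session identifiers within the incident window. The sketch below assumes each request is a record with hypothetical `session_id`, `watchpoint`, and `timestamp` fields; the product's actual data model is not described in this topic.

```python
def affected_sessions(requests, watchpoint, start, end):
    """Count distinct sessions seen on `watchpoint` in the window [start, end).

    The record fields used here are illustrative names, not a documented schema.
    """
    sessions = {
        r["session_id"]
        for r in requests
        if r["watchpoint"] == watchpoint and start <= r["timestamp"] < end
    }
    return len(sessions)


# Hypothetical requests; timestamps are minutes since some reference point.
requests = [
    {"session_id": "a", "watchpoint": "checkout", "timestamp": 5},
    {"session_id": "a", "watchpoint": "checkout", "timestamp": 12},
    {"session_id": "b", "watchpoint": "checkout", "timestamp": 20},
    {"session_id": "c", "watchpoint": "search", "timestamp": 10},
    {"session_id": "d", "watchpoint": "checkout", "timestamp": 45},
]
count = affected_sessions(requests, "checkout", 0, 30)
```

Session "a" appears twice but is counted once, and sessions outside the Watchpoint or the time window are ignored.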

Monitoring incidents

To monitor incidents that occur in your web traffic, you can:

Watch the Incidents page of a Real User Analyzer component, which lists the incidents that occurred during the specified period of time
Receive incident alerts via email or SNMP traps
Monitor the incident-related dashlets on the Real User Analyzer dashboard
Monitor Watchpoints for incidents

TrueSight Application Management 11.3
