Detecting anomalies
Enabling or disabling anomaly detection
Anomaly detection is enabled by default. If you are upgrading from a previous version of TrueSight IT Data Analytics, your existing anomaly detection settings are carried forward to the new version. If you previously disabled anomaly detection and want to enable it, or vice versa, use the configureanomaly CLI command. For more information about using the command line to enable or disable anomaly detection, see configureanomaly CLI command.
Viewing anomalies
Anomalies can be viewed in the context of a search. To view anomalies, you do not need any specific domain knowledge about the data being monitored. For example, you do not need to search for a specific error or warning message to view anomalies in a given time period. Typically, a time range and, optionally, a tag value are sufficient to isolate anomalies. When you perform a search, you select a time range and a tag value, and the anomalies shown are based on that search context. For more information about tags, see Understanding tags.
Anomaly detection takes into consideration the hosts available under the Filters panel (during the search). When you view anomalies, the host information is identified and displayed next to the anomalous events to enable you to get better insights.
To view anomalies
- Ensure that anomaly detection is already enabled.
- Navigate to the Search tab and perform a search.
- On the top-left of the page, click the vertical three dots (indicating a menu) next to All Data and select Analyze Data.
  The Coalesce page is displayed.
- From the Coalesce page, turn on the Anomalies setting to view anomalies.
This results in a summary of anomalous events categorized as out-of-range events and rare events.
Understanding how anomalies are detected
If you view anomalies for a new type of collected data, all the events coming from that data might be indicated as infrequent occurrences (rare events). For example, if you view anomalies after enabling anomaly detection for the first time or after adding a new type of data collector, it is possible that initially all the events are indicated as rare events. This is because the product is still forming the common pattern against which anomalies are detected. Over time, the pattern is adjusted and the results gradually become more accurate.
By default, the anomaly job runs every 15 minutes. This means that if you run a search for a period that falls just after the last run of the anomaly job, you might not see current results. In that case, a message to that effect is displayed on the Anomaly page.
Anomalies are detected based on the common patterns learnt. To learn common patterns, TrueSight IT Data Analytics isolates a window of seven days before the search time. This means that when you search, anomalies are detected based on the common pattern formed in the seven days before the search time context. For example, if you search for the last 24 hours, anomalies are detected based on the seven days that precede that 24-hour period.
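The window arithmetic described above can be sketched as follows. This is only an illustration of the date calculation, not the product's implementation; the function name `baseline_window` and the sample timestamps are assumptions made for this example.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Seven-day learning window, as stated in the text.
LEARNING_WINDOW = timedelta(days=7)


def baseline_window(search_start: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) of the period used to learn common patterns.

    The window ends where the search time range begins and extends
    seven days back from that point.
    """
    return search_start - LEARNING_WINDOW, search_start


# Hypothetical search: "last 24 hours" as of 15 March 2024, 12:00.
now = datetime(2024, 3, 15, 12, 0)
search_start = now - timedelta(hours=24)

window_start, window_end = baseline_window(search_start)
print("Patterns learnt from", window_start, "to", window_end)
```

With these sample values, the learning window runs from 7 March 12:00 to 14 March 12:00, that is, the seven days before the start of the 24-hour search range.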
The common patterns learnt are preserved in the system until the maximum data retention period (available under Administration > System Settings) expires, and then for an additional seven days. For example, if your maximum data retention period is set to 14 days, the common patterns learnt are preserved in the system for 14 + 7 = 21 days.
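As a quick check of the retention arithmetic, the calculation can be written as a small helper. The function name `pattern_retention_days` is hypothetical and is used here only for illustration.

```python
# Additional days for which learnt patterns are kept, as stated in the text.
PATTERN_GRACE_DAYS = 7


def pattern_retention_days(max_data_retention_days: int) -> int:
    """Days for which learnt common patterns are preserved."""
    return max_data_retention_days + PATTERN_GRACE_DAYS


print(pattern_retention_days(14))  # 21
```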
Based on the number of times the common pattern occurs, the product establishes a normal range. Any event that falls outside this range is indicated as an outlier event, and any event that cannot be included in the range (because it was never seen before or was seen infrequently) is indicated as a rare event.
Outlier events
Outlier events occur when the number (or count) of events matching a specific pattern type (that is, the common pattern) is outside the normal range for a given period.
For example, suppose 20-30 events matching pattern X usually occur, but in the last hour, 200 events of pattern X occurred. Such an instance is considered an outlier event. The current count of events (200) deviates from the common pattern (20-30) on the higher side. Therefore, this deviation is visually plotted on the bar (representing the deviation factor) under the Deviation column, on the right side of the splitting line.
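The range check in this example can be sketched as follows. This is a simplified illustration under stated assumptions: the normal range is passed in directly as two numbers, the function name `classify_count` is hypothetical, and the deviation factor is not computed because this page does not describe how the product derives it.

```python
def classify_count(observed: int, normal_low: int, normal_high: int) -> str:
    """Compare an observed event count with the normal range for a pattern."""
    if observed > normal_high:
        return "outlier (above normal range)"
    if observed < normal_low:
        return "outlier (below normal range)"
    return "normal"


# Pattern X usually occurs 20-30 times; 200 events were seen in the last hour.
print(classify_count(200, 20, 30))  # outlier (above normal range)
print(classify_count(25, 20, 30))   # normal
```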
The following table describes the information available under the Outlier tab:
Rare events
Rare events occur when events matching a specific pattern type (that is, the common pattern) have occurred fewer than five times in a given time period. Rare events are those that are considerably different from the rest of the data.
The following table describes the information available for the rare events:
Changing the matching factor for viewing anomalies
Anomalies are identified based on the coalesced search results. By default, you can view anomalies based on a matching factor of 70%. This means that anomalous records are grouped together when they are at least 70% similar (in other words, they vary by a maximum of 30%).
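To give a feel for what a 70% matching factor means, the following sketch groups sample messages by text similarity. It is a rough illustration only: the product's actual similarity measure is not described on this page, so Python's `difflib.SequenceMatcher` is used as a stand-in, and the function name `group_by_similarity` and the sample messages are assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def group_by_similarity(messages: List[str], matching_factor: float = 0.70) -> List[List[str]]:
    """Greedily group messages whose similarity to a group's first
    member is at least the matching factor."""
    groups: List[List[str]] = []
    for msg in messages:
        for group in groups:
            # Compare against the first message in the group (its representative).
            if SequenceMatcher(None, group[0], msg).ratio() >= matching_factor:
                group.append(msg)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([msg])
    return groups


sample = [
    "Connection to host db01 timed out",
    "Connection to host db02 timed out",
    "Disk usage at 91 percent",
]
for group in group_by_similarity(sample):
    print(len(group), "record(s):", group[0])
```

In this sketch, the two connection messages differ by a single character and fall into one group, while the disk usage message forms its own group. Raising the matching factor makes grouping stricter, and lowering it makes grouping more permissive.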
You can change the matching factor by using the configurematchingfactor CLI command. For more information, see configurematchingfactor CLI command.
For more information about the matching factor, see Viewing coalesced results.