Detecting anomalies

Detecting anomalies can help you find unusual patterns in the data that you are monitoring. Anomalies not only tell you whether the usual pattern has changed, but also show you, both visually and numerically, how much the new pattern deviates from the usual pattern. Additionally, they show you infrequent occurrences of events.

This information can help you gain important insights that in turn help you uncover new problems and detect the root cause of a problem more quickly.

To get an overview and understand the use cases for viewing anomalies, see Getting started with anomaly detection.

This topic provides the following information about viewing and understanding anomalies:

Enabling or disabling anomaly detection

Anomaly detection is enabled by default. If you are upgrading from a previous version of TrueSight IT Data Analytics, the settings that you specified for anomaly detection are carried forward to the new version. If you had disabled anomaly detection earlier and want to enable it, or vice versa, you can use the configureanomaly CLI command. For more information about using the command line to enable or disable anomaly detection, see configureanomaly CLI command.

Note

Anomaly detection requires additional resources (in terms of CPU and maximum Java heap size). BMC recommends that you contact BMC Support for help with tuning your setup to get optimal product performance.

Viewing anomalies

Anomalies are viewed in the context of a search. To view anomalies, you do not need any specific domain knowledge about the data being monitored. For example, you do not need to search for a specific error or warning message to view anomalies in a given time period. Typically, a time range and, optionally, a tag value are sufficient to isolate anomalies. This means that when you perform a search, you can select a time range and a tag value, and anomalies are shown for that search context. For more information about tags, see Understanding tags.

Anomaly detection takes into consideration the hosts available under the Filters panel (during the search). When you view anomalies, the host information is identified and displayed next to the anomalous events to enable you to get better insights.

To view anomalies

  1. Ensure that anomaly detection is already enabled.
  2. Navigate to the Search tab and perform a search.
  3. On the top-left of the page, click the vertical three dots (indicating a menu) next to All Data and select Analyze Data.
    The Coalesce page is displayed.
  4. From the Coalesce page, turn on the Anomalies setting to view anomalies.
    This results in a summary of anomalous events categorized as out-of-range events and rare events.

Understanding how anomalies are detected

If you try to view anomalies for any new type of data collected, all the events (coming from that data) might be indicated as infrequent occurrences (or rare events). For example, if you try to view anomalies after enabling anomaly detection for the first time or after adding a new type of data collector, it is possible that initially all the events are indicated as rare events. This is because the product is still forming the common pattern based on which anomalies are detected. Over time, the pattern is adjusted and slowly you can expect to see more accurate results.

By default, the anomaly job is run every 15 minutes. This means if you run a search for a period just after the anomaly job was last run, it is possible that you do not see current results. In such a scenario, a message indicating the same is displayed on the Anomaly page.
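The timing described above can be pictured with a small sketch. It is illustrative only: the function names are hypothetical, and the check simply assumes that results are current up to the time the anomaly job last ran, which is an interpretation of the paragraph above rather than documented product logic.

```python
from datetime import datetime, timedelta

# Default interval between anomaly job runs, as stated above.
JOB_INTERVAL = timedelta(minutes=15)


def results_may_be_stale(last_job_run: datetime, search_end: datetime) -> bool:
    """Return True if the search range extends past the last job run.

    Assumption: anomaly results are only current up to the last job run.
    """
    return search_end > last_job_run


def next_job_run(last_job_run: datetime) -> datetime:
    """Return the time at which the next job run is expected (default interval)."""
    return last_job_run + JOB_INTERVAL


last_run = datetime(2024, 1, 1, 10, 0)
search_end = datetime(2024, 1, 1, 10, 5)

print(results_may_be_stale(last_run, search_end))  # True
print(next_job_run(last_run))                      # 2024-01-01 10:15:00
```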

Anomalies are detected based on the common patterns learnt. To learn common patterns, TrueSight IT Data Analytics isolates a window of seven days before search time. This means when you search, anomalies are detected based on the common pattern formed in the last seven days before the search time context. For example, if you search for the last 24 hours, anomalies are detected based on the last seven days before the last 24 hours search time context.
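As a rough sketch of the example above, the learning window can be computed from the start of the search range. The function name is hypothetical, and the sketch assumes that the window ends exactly where the search time range begins.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Length of the window used to learn common patterns, as stated above.
LEARNING_WINDOW = timedelta(days=7)


def learning_window(search_start: datetime) -> Tuple[datetime, datetime]:
    """Return (start, end) of the seven-day window preceding the search range.

    Assumption: the window ends at the start of the search time context.
    """
    return search_start - LEARNING_WINDOW, search_start


# A search for the last 24 hours, run at a hypothetical "now".
now = datetime(2024, 1, 10, 12, 0)
search_start = now - timedelta(hours=24)

window_start, window_end = learning_window(search_start)
print(window_start)  # 2024-01-02 12:00:00
print(window_end)    # 2024-01-09 12:00:00
```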

The common patterns learnt are preserved in the system until the maximum data retention period (available under Administration > System Settings) expires, and for an additional seven days. For example, if your maximum data retention period is set to 14 days, the common patterns learnt are preserved in the system for 14 + 7 = 21 days.
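The retention arithmetic can be written out as a one-line calculation. This is only a sketch; the function and constant names are hypothetical.

```python
from datetime import timedelta

# Additional time for which learnt patterns are kept, as stated above.
EXTRA_PATTERN_RETENTION = timedelta(days=7)


def pattern_retention(max_data_retention_days: int) -> timedelta:
    """Return how long learnt common patterns are preserved."""
    return timedelta(days=max_data_retention_days) + EXTRA_PATTERN_RETENTION


print(pattern_retention(14).days)  # 21
```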

Based on the number of times the common pattern occurs, the product establishes a normal range. Any event that falls outside this range is indicated as an outlier event, and any event that cannot be included in the range (because it was never seen before or was seen infrequently) is indicated as a rare event.

Outlier events

Outlier events occur when the number (or count) of events matching a specific pattern type (that is the common pattern) is outside the normal range for a given period.

For example, suppose that 20-30 events matching pattern X usually occur, but in the last hour, 200 events of pattern X occurred. This instance is considered an outlier event. The current count of events (200) deviates from the common pattern (20-30) on the higher side. Therefore, this deviation is plotted on the bar (representing the deviation factor) under the Deviation column, on the right side of the splitting line.
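The comparison in this example can be sketched as follows. The function names are hypothetical, and the sketch reports only the direction of the deviation and the raw distance from the normal range. It does not attempt to reproduce the percentage deviation factor shown in the product, because that formula is not described here.

```python
from typing import Tuple


def classify_count(count: int, normal_range: Tuple[int, int]) -> str:
    """Classify an event count against an inclusive (low, high) normal range."""
    low, high = normal_range
    if count > high:
        return "higher"  # deviation on the higher side (right of the splitting line)
    if count < low:
        return "lower"   # deviation on the lower side (left of the splitting line)
    return "normal"


def distance_from_range(count: int, normal_range: Tuple[int, int]) -> int:
    """Return how far the count lies outside the normal range (0 if inside)."""
    low, high = normal_range
    if count > high:
        return count - high
    if count < low:
        return low - count
    return 0


normal = (20, 30)  # usual number of events matching pattern X
print(classify_count(200, normal))       # higher
print(distance_from_range(200, normal))  # 170
```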

Example

Suppose you are monitoring logs that provide information about daily logins and user activity for an application (with anomaly detection enabled). Suppose the normal range for the common pattern identified for this kind of information is around 40. This means that around 40 users log on to the application daily.

Scenario 1: If only 10 users log on to the application on a particular day, this is indicated as an anomaly on the Outlier tab, because the count of 10 deviates from the normal range of 40 on the lower side. From the Outlier tab, you can infer that some users might be facing issues logging on to the application, or that many people who work on the application are on leave.

Scenario 2: If there are 1000 attempts to log on to the application on a particular day, this is indicated as an anomaly on the Outlier tab, because the count of 1000 deviates greatly from the normal range of 40 on the higher side. From the Outlier tab, you can infer that an attacker might be attempting a brute-force attack to cause denial of service for a large number of user accounts.

The following table describes the information available under the Outlier tab:

Column name | Description
Deviation

The bar displayed under this column visually represents the deviation factor.

The deviation factor tells you whether the anomalous record deviates from the common pattern on the lower side or the higher side, and the amount by which it deviates, that is, how far the anomalous record lies from the normal range.

The bar representing the deviation factor is split into two parts by a vertical line. If the deviation is on the higher side, the right side of the line is colored and if the deviation is on the lower side, the left side of the line is colored (as shown in the following figure).

The amount by which the deviation occurs is indicated by the portion of the bar colored after the splitting line and the deviation factor represented in terms of percentage on top of the bar. This visual representation can also help you compare deviations across various anomalous records.

When you click the bar or the numeric figure (deviation factor) on top of the bar, two adjacent sections are displayed by default. The left-hand section, Historical Data, displays historical data for the chosen signature for the time period immediately preceding the selected time period, based on the configured retention period. The right-hand section, Current Search Data, displays data for the currently selected time period. Both charts display the timeline on the x-axis and the number of events on the y-axis. You can zoom in on the first chart by clicking and dragging the sliders at the bottom of the chart. By using these sliders, you can select the date and time range that you want to zoom in on.

To zoom in, you can also use Shift + Scroll (on the mouse).

The charts also display the range of normal observations, outside of which a data point is considered as an outlier, in a grey shaded portion as shown in the following figure.

The representation considers any change in intervals between successive baselines and normalizes the range of normality accordingly.

You can drill down into the context of data points displayed from the baselined data by selecting a time range, using the zoom slider and then clicking on the data-point-of-interest.

When you click on the three dots menu in front of Chart, you can select Show Top 10 (by count) to display top ten records occurring in the coalesced anomalous record (as shown in the following figure).

To see the bottom ten records, click the three vertical dots menu next to Top Records, and select Show Bottom 10 (by count). Note that the Show Bottom 10 (by count) option appears only when there are that many records to show.

To return to the top ten records list, click the three vertical dots menu and select Show Top 10 (by count). To search with the message of the individual record and see the results arising, click Launch Search next to the message.

You can also sort the anomalous records in ascending or descending order by selecting the Sort Ascending or Sort Descending option from the three dots menu on this column (as shown in the following figure). Based on the sorting order that you select, an associated arrow is displayed next to the column name.

Hosts

The host information corresponds to the search query based on which the anomalies are detected.

The hosts selected under the Filters panel during the search determine the hosts displayed next to the anomalous records. This information can give you better insights into the anomalous records and help you make better decisions.

You can narrow down the anomalous records displayed by selecting hosts from the three dots menu on this column.

Anomalous records

Because anomalies are detected based on the coalesced results, in the table, you can see that the anomalous records display only the common (coalesced) message with ellipsis (...) (as shown in the following figure). The ellipsis indicates varying portions in the message. This coalesced message gives you a high-level idea of the records deviating from the normal range.

To drill down into the common (or coalesced) message, click the bar (right portion) displayed under the Deviation column or click the numeric figure (deviation factor) displayed on top of the bar, next to the anomalous record. By drilling down, you can see the most common (top ten records) and the least common (bottom ten records) occurring for the coalesced anomalous record.

Launch Search

Runs a search with the anomalous record message and shows the results on the All Data page.
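The coalesced message described under Anomalous records, in which varying portions are replaced by an ellipsis, can be pictured with a small sketch. This is not the product's coalescing algorithm, which is not described here. It uses Python's standard difflib module as a stand-in, compares messages word by word, and uses hypothetical log messages.

```python
from difflib import SequenceMatcher


def coalesce(message_a: str, message_b: str) -> str:
    """Keep the words two messages share and replace varying portions with '...'."""
    a, b = message_a.split(), message_b.split()
    parts = []
    for tag, i1, i2, _j1, _j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            parts.extend(a[i1:i2])  # common portion of the message
        else:
            parts.append("...")     # varying portion of the message
    return " ".join(parts)


# Hypothetical log messages, for illustration only.
first = "User alice logged in from host web01 successfully"
second = "User bob logged in from host web02 successfully"

print(coalesce(first, second))
# User ... logged in from host ... successfully
```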

Rare events

Rare events occur when events matching a specific pattern type (that is, the common pattern) have occurred fewer than five times in a given time period. Rare events are those that differ considerably from the rest of the data.
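The threshold in this definition can be sketched as follows. The function names are hypothetical. The labels mirror the ones shown in the Previous occurrences column (Never seen before, Seen 1 time, and so on), and the sketch assumes a single occurrence count per pattern.

```python
# "Fewer than five times", per the definition above.
RARE_THRESHOLD = 5


def is_rare(previous_occurrences: int) -> bool:
    """Return True if the pattern was seen fewer than five times."""
    return previous_occurrences < RARE_THRESHOLD


def occurrence_label(previous_occurrences: int) -> str:
    """Build a label in the style of the Previous occurrences column."""
    if previous_occurrences == 0:
        return "Never seen before"
    unit = "time" if previous_occurrences == 1 else "times"
    return f"Seen {previous_occurrences} {unit}"


for count in (0, 1, 2, 5):
    print(count, is_rare(count), occurrence_label(count))
# 0 True Never seen before
# 1 True Seen 1 time
# 2 True Seen 2 times
# 5 False Seen 5 times
```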

Example

Suppose you are monitoring logs that provide information about the features that are most used and least used in an application (with anomaly detection enabled).

In this scenario, you can look at the Rare tab to find features that were least used or never used before.

The following table describes the information available for the rare events:

Column name | Description
Previous occurrences

The bar displayed under this column visually represents the count of the rare event occurrences. This count is also represented on top of the bar, for example, Never seen before, Seen 1 time, and so on.

The bar displayed is colored based on the count of rare occurrences. If the count is high, a larger portion of the bar is colored and vice versa. This visual representation can help you compare the count across other rare events (anomalous records).

Note that the count of rare occurrences refers to the number of times the same pattern (anomalous message) was seen in the last seven days before search time. The messages matching the common pattern might have occurred 1000 times; however, if this pattern has been seen on only two occasions in the last seven days, the number of previous occurrences is represented as Seen 2 times (as shown in the following figure).

When you click the bar or the numeric figure on top of the bar, two adjacent sections are displayed by default. The left-hand section, Historical Data, displays rare occurrences in the historical data for the chosen signature for the time period immediately preceding the selected time period, based on the configured retention period. The right-hand section, Current Search Data, displays data for the currently selected time period. Both charts display the timeline on the x-axis and the number of events on the y-axis. You can zoom in on the first chart by clicking and dragging the sliders at the bottom of the chart. By using these sliders, you can select the date range that you want to zoom in on. Note that charts are not displayed for rare anomalies that are indicated as Never seen before.

To zoom in, you can also use Shift + Scroll (on the mouse).

The charts also display the range of normal observations, outside of which a data point is considered as an outlier, in a grey shaded portion as shown in the following figure.

You can drill down into the context of data points displayed from the baselined data by selecting a time range, using the zoom slider and then clicking on the data-point-of-interest.

When you click on the three dots menu in front of Chart, you can select Show Top 10 (by count) to display top ten records occurring in the coalesced anomalous record (as shown in the following figure).

To see the bottom ten records, click the three vertical dots menu next to Top Records, and select Show Bottom 10 (by count). Note that the Show Bottom 10 (by count) option appears only when there are that many records to show.

To return to the top ten records list, click the three vertical dots menu and select Show Top 10 (by count). To search with the message of the individual record and see the results arising, click Launch Search next to the message.

You can also sort the anomalous records in ascending or descending order by selecting the Sort Ascending or Sort Descending option from the three dots menu on this column (as shown in the following figure). Based on the sorting order that you select, an associated arrow is displayed next to the column name.

Hosts

The host information corresponds to the search query based on which the anomalies are detected.

The hosts selected under the Filters panel during the search determine the hosts displayed next to the anomalous records. This information can give you better insights into the anomalous records and help you make better decisions.

You can narrow down the anomalous records displayed by selecting hosts from the three dots menu on this column.

Anomalous records

Because anomalies are detected based on the coalesced results, in the table, you can see that the anomalous records display only the common (coalesced) message with ellipsis (...) (as shown in the following figure). The ellipsis indicates varying portions in the message.

The common message gives you a high-level idea of the anomalous event. To drill down into the common (or coalesced) message, click the bar (right portion) displayed under the Deviation column or click the numeric figure (deviation factor) displayed on top of the bar, next to the anomalous record. By drilling down, you can see the most common (top ten records) and the least common (bottom ten records) occurring for the coalesced anomalous record.

Launch Search

Runs a search with the anomalous record message and shows the results on the All Data page.

Changing the matching factor for viewing anomalies

Anomalies are identified based on the coalesced search results. By default, you can view anomalies based on a matching factor of 70%. This means anomalous records are grouped based on at least 70% similarity (in other words, maximum 30% variation).
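To give a feel for what a 70% matching factor means, the following sketch groups messages whose similarity to the first message of a group is at least 0.70. It is not the product's algorithm: the similarity measure here is the word-level ratio from Python's standard difflib module, used as a stand-in, and the messages and function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List

# Default matching factor, as stated above (70%).
MATCHING_FACTOR = 0.70


def similarity(message_a: str, message_b: str) -> float:
    """Word-level similarity between two messages, from 0.0 to 1.0."""
    return SequenceMatcher(None, message_a.split(), message_b.split()).ratio()


def group_messages(messages: List[str], factor: float = MATCHING_FACTOR) -> List[List[str]]:
    """Greedily group messages that are at least `factor` similar to a group's first message."""
    groups: List[List[str]] = []
    for message in messages:
        for group in groups:
            if similarity(group[0], message) >= factor:
                group.append(message)
                break
        else:
            groups.append([message])  # no existing group is similar enough
    return groups


# Hypothetical log messages, for illustration only.
messages = [
    "User alice logged in from host web01 successfully",
    "User bob logged in from host web02 successfully",
    "Disk usage exceeded threshold on /var",
]

print(similarity(messages[0], messages[1]))  # 0.75
print(len(group_messages(messages)))         # 2
```

In this sketch, the first two messages share six of their eight words, which gives a similarity of 0.75, so they fall into the same group. Raising the factor above 0.75 would split them into separate groups.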

You can change the matching factor by using the configurematchingfactor CLI command. For more information, see configurematchingfactor CLI command.

For more information about matching factor, see Viewing coalesced results.
