BMC ProactiveNet thresholds are the elements that generate events based on infrastructure performance data. This document describes the various types of thresholds, their interaction with the BMC ProactiveNet environment, their uses, and some management practices.
There are three types of thresholds that are native to the BMC ProactiveNet Server: Absolute, Signature, and Abnormality thresholds. In addition, BMC ProactiveNet PATROL KMs have their own thresholds, which are not directly managed by the BMC ProactiveNet Server, but which are capable of generating events that integrate with BMC ProactiveNet Server Analytics.
The term Intelligent Threshold is used liberally throughout the product documentation. It simply means the use of baselines in any of the three native BMC ProactiveNet threshold types.
Absolute thresholds are straightforward to understand: for an event to be generated, one hundred percent of the performance data points have to satisfy the trigger condition.
A trigger condition is defined to be:
Absolute thresholds' event-triggering behavior is easier to grasp because all performance data points must satisfy the trigger condition for a specified duration before an event is created. The absence of the trigger condition for the same duration closes the event.
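The open and close behavior described above can be sketched as a small state machine. This is only an illustrative model, not BMC ProactiveNet code: the class name `AbsoluteThreshold`, its fields, and the choice to express the duration as a count of consecutive samples (rather than a time span) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class AbsoluteThreshold:
    """Hypothetical model of an absolute threshold (not a BMC API)."""
    condition: Callable[[float], bool]  # trigger condition, e.g. value > 90
    duration: int                       # assumed: consecutive samples required

    def evaluate(self, samples: Iterable[float]) -> List[Tuple[int, str]]:
        """Return (sample_index, "OPEN" or "CLOSE") transitions."""
        transitions: List[Tuple[int, str]] = []
        event_open = False
        streak = 0  # consecutive samples that contradict the current state
        for i, value in enumerate(samples):
            violating = self.condition(value)
            # With no open event we count violating samples; with an open
            # event we count samples where the condition is absent.
            if violating != event_open:
                streak += 1
            else:
                streak = 0
            # Every sample in the duration must agree before the state flips.
            if streak >= self.duration:
                event_open = not event_open
                transitions.append((i, "OPEN" if event_open else "CLOSE"))
                streak = 0
        return transitions


threshold = AbsoluteThreshold(condition=lambda v: v > 90, duration=3)
samples = [95, 96, 80, 95, 96, 97, 98, 50, 60, 99, 40, 30, 20]
print(threshold.evaluate(samples))
```

In this model a single sample that breaks the run (the 80 at index 2, or the 99 at index 9) resets the count, which is why neither the event nor its closure is produced until a full uninterrupted run has been observed.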
Having the baseline in the trigger condition allows a threshold to generate events based on learned behavior.
The following figure illustrates a typical trigger condition for an absolute threshold.
Absolute thresholds with outside baseline combine a static threshold value (the absolute threshold) with a dynamic threshold (the baseline). The user can still set the threshold value, but with the baseline as an additional criterion for alarm generation.
Absolute thresholds with no baselines are for availability metrics, or other metrics that have discrete, state-based values; for example, 0=OK, 1=Down, 2=Admin Down.
Absolute thresholds with baselines are for user-influenced performance metrics with normalized values (0-100) or with set definitions of values, such as CPU utilization, memory utilization, process utilization, and so on.
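As a rough sketch of how a static value and a baseline can combine into one trigger condition, the function below checks an "above" comparison with and without a baseline. The function name `violates`, its parameters, and the representation of the baseline as a (low, high) pair are hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple


def violates(value: float,
             threshold: float,
             baseline: Optional[Tuple[float, float]] = None) -> bool:
    """Hypothetical trigger condition for an "above" comparison.

    baseline is an assumed (low, high) band of learned behavior, or None
    when the threshold does not use a baseline.
    """
    above_static = value > threshold
    if baseline is None:
        # Absolute threshold with no baseline: the static value decides.
        return above_static
    _low, high = baseline
    # Absolute threshold with outside baseline: the value must exceed the
    # static threshold AND fall outside (above) the learned band.
    return above_static and value > high


print(violates(95, 90))              # no baseline
print(violates(95, 90, (85, 97)))    # above 90, but inside the baseline band
print(violates(98, 90, (85, 97)))    # above 90 and above the baseline band
```

The second call shows the practical effect: a value that crosses the static threshold but is normal for that metric does not count as a violation.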
Absolute thresholds tend to fail to create events in scenarios where the performance data behaves more transiently, making the trigger condition less obvious. Signature thresholds can be used here because they can judge when enough data points have met the trigger condition to warrant an event, as illustrated below.
Due to the non-deterministic nature of the performance data, it might take longer than anticipated for a Signature threshold to generate an event.
The difference between Absolute and Signature thresholds is in the percentage of data points that must meet the trigger condition: 100% for Absolute thresholds, and just enough for Signature thresholds.
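The contrast can be made concrete with a toy evaluation over one window of data points. The source does not say how "just enough" is determined, so the `required_fraction` parameter and the value 0.6 used below are purely illustrative assumptions; the function name `window_triggers` is hypothetical as well.

```python
from typing import Callable, Sequence


def window_triggers(window: Sequence[float],
                    condition: Callable[[float], bool],
                    required_fraction: float) -> bool:
    """Return True if enough points in the window meet the condition.

    required_fraction is an assumed stand-in: 1.0 models an Absolute
    threshold, and a smaller value loosely models a Signature threshold.
    """
    if not window:
        return False
    hits = sum(1 for v in window if condition(v))
    return hits / len(window) >= required_fraction


def above_90(v: float) -> bool:
    return v > 90


transient = [95, 40, 96, 97, 30, 98]  # 4 of 6 points violate
print(window_triggers(transient, above_90, 1.0))  # Absolute-style check
print(window_triggers(transient, above_90, 0.6))  # illustrative Signature-style check
```

With transient data like this, the 100% requirement is never met, while a check that accepts a sufficient share of violating points still fires.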
Signature thresholds are for performance metrics that have no set concept of a good or bad value, where the interpretation depends entirely on the attribute and the instances being monitored (transaction response time, ping response time, and so on).
For example, a ping response time of 900 ms is poor if the target is a box in the same data center across a gigabit switch, but it is considered good if the target is a box across the continent.
Absolute and Signature thresholds generate actionable events with a severity of MINOR, MAJOR, or CRITICAL. Abnormality thresholds create events that are meant to be invisible and are labeled with severity INFO. Otherwise, Abnormality thresholds behave identically to Signature thresholds.
PATROL thresholds are implemented by the PATROL KMs directly. Events are generated by the KMs and forwarded directly to the BMC ProactiveNet Server.
Events that are generated by the BMC ProactiveNet Server (by the three native thresholds) are called Internal events. They are marked with icons having a double wrench. The key characteristics of Internal events are:
Define three thresholds for the attribute named Response Time.
Over time, all three thresholds trigger one after the other with their respective severity, in order of increasing severity (MINOR, then MAJOR, then CRITICAL), and then release in order of decreasing severity. The sequence of actions over time is:
mc_ueid, dev1-alr-3322 with MINOR severity.
All events that are not generated by the BMC ProactiveNet Server's native thresholds are considered external and marked with icons having a single wrench.
Metrics that show unambiguous behavior such as up/down, availability, or capacity violation are good candidates for Absolute thresholds.
Signature thresholds are more appropriate for metrics that are more transient, such as response time, packets per second, and so on, where the data can exhibit big swings against a generally upward or downward trend.
PATROL thresholds are appropriate to use for scenarios that are clear-cut, such as device availability, critical capacity overloads, and so on, which do not need to make use of advanced, server-side analytic capabilities. PATROL thresholds have the advantage of quickly triggering actionable events without having to wait for the data to be collected and passed along to the BMC ProactiveNet Server.
Abnormality thresholds generate informational events when key metrics go into exceptional states. These events become useful during troubleshooting scenarios that use Probable Cause Analysis (PCA), but are otherwise ignored.
By default, you do not have to do any customization for Abnormality thresholds since they have already been created for all KPIs. However, if you customize the KPI list, ensure that you create new Abnormality thresholds.
Key Performance Indicators (KPIs) are essential metrics for monitoring an infrastructure. They have a direct impact on whether or not baseline computation takes place for corresponding metrics. The following figures show how KPIs may affect baseline generation where the checked boxes indicate that baseline generation gets carried out for those combinations.
[Figure: baseline generation when KPI mode is active (BL only for KPI); columns: Have Abnormality Thresholds, No Abnormality Thresholds]
[Figure: baseline generation when KPI mode is not active (BL for all metrics); columns: Have Abnormality Thresholds, No Abnormality Thresholds]
To function correctly, Abnormality and Signature thresholds require baseline data. Without it, certain thresholds do not work, which can surface as support issues. In such cases, check whether the baseline is being generated for the metrics in question.
The BMC ProactiveNet Server automatically computes three different types of baselines (Hourly, Daily, and Weekly) to be used by the thresholds. In most cases, when defining thresholds, it is adequate to use Auto Baseline, where the BMC ProactiveNet Server determines the best type of baseline to use for any given metric.
However, if it is known that certain metrics have clear, repeatable hourly patterns (for example, 10 AM on Tuesday behaves in the same way as 10 AM on Wednesday), then you can select Hourly Baseline as the baseline type to use for those corresponding thresholds. Similarly, Daily and Weekly baselines can be used by thresholds if you know that their metrics behave accordingly.
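To illustrate what an hourly pattern means in practice, the sketch below groups samples by hour of day, so that 10 AM readings from different days land in the same bucket. How the BMC ProactiveNet Server actually computes its baselines is not described here; the function name `hourly_baseline` and the use of a simple minimum and maximum as the band are assumptions made only for this example.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


def hourly_baseline(
    samples: Iterable[Tuple[datetime, float]]
) -> Dict[int, Tuple[float, float]]:
    """Return an assumed (low, high) band for each hour of the day."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for timestamp, value in samples:
        # Samples from the same hour on different days share one bucket.
        buckets[timestamp.hour].append(value)
    # Assumption: the band is the min and max seen in that hour.
    return {hour: (min(vals), max(vals)) for hour, vals in buckets.items()}


samples = [
    (datetime(2024, 1, 2, 10, 0), 40.0),   # 10 AM on one day
    (datetime(2024, 1, 3, 10, 15), 50.0),  # 10 AM on the next day
    (datetime(2024, 1, 2, 11, 0), 70.0),
]
print(hourly_baseline(samples))
```

A metric suits this kind of baseline only when the per-hour bands stay narrow across days; wide bands suggest the metric does not follow an hourly pattern.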
Seasonality baselines are useful for infrastructures that have recurring periods during which (part of) the infrastructure behaves very differently, and where these behaviors should not be factored into the normal baselines; for example, a major backup on the last Friday of every month, financial number crunching at the end of every quarter, and so on.
For Seasonality baselines to work properly, the BMC ProactiveNet administrator has to ensure that the baseline retention period properly reflects the special recurring period. For example, if the recurring period is twelve months long, the baseline retention period has to be just as long.
Contact BMC Support when retention periods are extended, as extended retention periods can severely degrade BMC ProactiveNet's system performance.
Thresholds are data-driven: the more data points are available, the sooner thresholds can generate events, especially those that make use of baselines. However, shorter polling intervals increase the BMC ProactiveNet Server's system load.
When creating a Signature threshold, it is desirable to fine-tune the behavior of the threshold. As shown in the following image, there are four additional fields that become visible when you select the advanced view when creating Signature thresholds.
The field descriptions are:
Description and Usage
Minimum Sampling Window
The minimum span of time, as marked by collected data points, required in order for the Signature threshold engine to initiate evaluation.
Specify this value if you do not want the algorithm to trigger on trivial conditions. For example, if the baseline is low (around 3% - 5%), specify a high threshold value so that Signature thresholds trigger only if data values are higher than 80% and surpass the baseline.
Use this to expand the baseline value range, typically to reduce the sensitivity of the Signature threshold.
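The interplay of these fields can be sketched as a filter that decides which data points count toward a Signature event. Everything named here is hypothetical: `min_window_seconds` stands in for Minimum Sampling Window, `floor_value` for the threshold value that screens out trivial conditions, and `pad_percent` for the setting that widens the baseline range. Treating the pad as a percentage of the baseline's high value is an assumption, not documented behavior.

```python
from typing import Sequence, Tuple


def eligible_violations(points: Sequence[Tuple[float, float]],
                        baseline_high: float,
                        min_window_seconds: float,
                        floor_value: float,
                        pad_percent: float = 0.0) -> int:
    """Count points that would count toward a Signature event.

    points holds (timestamp_seconds, value) pairs in time order.
    """
    if not points:
        return 0
    # Evaluation starts only once the collected points span the window.
    span = points[-1][0] - points[0][0]
    if span < min_window_seconds:
        return 0
    # Assumption: the pad widens the baseline by a percentage of its high.
    padded_high = baseline_high * (1 + pad_percent / 100.0)
    # A point counts only if it clears the floor AND the padded baseline.
    return sum(1 for _, v in points if v > floor_value and v > padded_high)


points = [(0, 85), (300, 90), (600, 4), (900, 95)]
print(eligible_violations(points, baseline_high=5,
                          min_window_seconds=600, floor_value=80))
print(eligible_violations(points, baseline_high=88,
                          min_window_seconds=600, floor_value=80,
                          pad_percent=5))
```

The first call mirrors the low-baseline case above: with a baseline around 5, only values above the floor of 80 count. The second call shows how padding a baseline of 88 leaves fewer points eligible, which lowers sensitivity.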
The Prediction feature gives early warnings of certain exceptional situations. It is used to issue warnings if there is an aggressive trend towards the threshold. While it can be used in a wide number of scenarios, it is most effective in capacity-type scenarios, especially for those metrics which exhibit clear hourly patterns.
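One way to picture "an aggressive trend towards the threshold" is a straight-line extrapolation of recent data. The source does not describe the algorithm the Prediction feature uses, so the least-squares fit below is a stand-in technique, and the function name `time_to_threshold` and its parameters are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def time_to_threshold(points: Sequence[Tuple[float, float]],
                      threshold: float) -> Optional[float]:
    """Estimate seconds until a fitted linear trend reaches the threshold.

    points holds (timestamp_seconds, value) pairs. Returns None when the
    trend is not rising, and 0.0 when the fit is already at the threshold.
    """
    n = len(points)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    denom = sum((t - mean_t) ** 2 for t, _ in points)
    if denom == 0:
        return None  # all samples share one timestamp; no trend to fit
    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in points) / denom
    intercept = mean_v - slope * mean_t
    fitted_now = slope * points[-1][0] + intercept
    if fitted_now >= threshold:
        return 0.0
    if slope <= 0:
        return None  # flat or falling: not heading toward the threshold
    return (threshold - fitted_now) / slope


# Disk usage rising 5 points per minute from 50%, threshold at 90%.
usage = [(0, 50), (60, 55), (120, 60), (180, 65)]
print(time_to_threshold(usage, 90))
```

Capacity-type metrics tend to suit this kind of extrapolation because they grow steadily, whereas a metric with large swings would give an unstable slope and unreliable warnings.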
Some of the points to remember when using this feature are:
Use the pw threshold checkpoint command to save the states of thresholds while customizing a deployment.