An alarm policy helps you monitor and manage the health of your system and safeguard against abnormalities. You can define thresholds in the alarm policy. Thresholds define an acceptable value above or below which an alarm is generated. These policies can be viewed and created by tenant administrators only.
Anomalies are observations that diverge from a well-structured data pattern. BMC Helix Operations Management plots the anomaly event in a graph so that you can easily distinguish the anomaly data point from the regular data points. An anomaly event is generated based on the anomaly score computation for the metrics configured in variate policies.
A baseline is the expected normal operating range for a metric or attribute of a monitor. The baseline is calculated by collecting the values for a monitor's attributes and performance metrics over a specific time period. Baseline calculation begins after six hours of aggregate data is available for a metric. You can specify baseline options while configuring alarm policies. When viewing device details, if baseline is selected for any supported metric, the graph shows the hourly high and hourly low baselines. When the metric breaches the static threshold and the high or low baseline, an event is generated.
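The combined threshold and baseline check described above can be sketched as follows. This is a minimal illustration, not the product's algorithm: the `Baseline` type, the `hourly_baseline` helper (which simply takes the minimum and maximum of past samples), and the assumption that the static threshold is an upper bound are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Baseline:
    """Hypothetical hourly baseline band for one metric."""
    low: float
    high: float


def hourly_baseline(samples: List[float]) -> Baseline:
    """Illustrative only: use the min and max of past samples for this hour."""
    return Baseline(low=min(samples), high=max(samples))


def should_generate_event(value: float, static_threshold: float,
                          baseline: Baseline) -> bool:
    """True when the value exceeds the static threshold AND leaves the baseline band."""
    breaches_threshold = value > static_threshold  # assumes an upper-bound threshold
    breaches_baseline = value > baseline.high or value < baseline.low
    return breaches_threshold and breaches_baseline


# Past CPU utilization samples for the same hour (made-up values).
band = hourly_baseline([40.0, 45.0, 50.0, 55.0])
```

With this sketch, a value that exceeds the threshold but stays inside the baseline band does not generate an event, which is the point of combining the two checks.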
A blackout policy enables you to black out events originating from various devices without stopping data collection. In this policy, you can configure a specific time period during which notifications about particular incoming events are ignored. The selection criteria in the blackout policy define the conditions based on which events are selected for blackout. You can also configure various blackout actions to be taken in addition to blacking out new events matching the selection criteria.
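As a rough sketch of how a time window and selection criteria combine, consider the following. The `Event` and `BlackoutPolicy` types and the host-based selection criteria are assumptions made for illustration; they do not reflect the actual policy schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Set


@dataclass
class Event:
    host: str
    occurred_at: datetime


@dataclass
class BlackoutPolicy:
    start: datetime
    end: datetime
    hosts: Set[str]  # hypothetical selection criteria: match by host name

    def applies_to(self, event: Event) -> bool:
        """An event is blacked out only if it is inside the window and matches the criteria."""
        in_window = self.start <= event.occurred_at < self.end
        return in_window and event.host in self.hosts


# A two-hour maintenance window for a single host (made-up values).
policy = BlackoutPolicy(
    start=datetime(2024, 1, 6, 22, 0),
    end=datetime(2024, 1, 7, 0, 0),
    hosts={"db01"},
)
```

Data collection is unaffected in this model: the policy only decides whether an incoming event is suppressed.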
The PATROL Agent configuration is saved in a set of configuration variables that are stored in the Agent's configuration database. You can control the PATROL Agent configuration by changing the values of these configuration variables. After you define a configuration variable, the definitions are enabled on the PATROL Agent when the policy is applied to it. Configuration variables that are defined on PATROL Agents are retained even after the policy is disabled. However, you can delete the configuration variables by purging the PATROL Agent.
Dashboard and Widgets
Dashboard enables you to graphically visualize the current state of various types of entities, metrics, and performance data in your environment. The Dashboard Widgets enable you to create customized views of the data from your environment.
After you configure the monitor types in the PATROL Agents, the collected data is sent to BMC Helix Operations Management. You can restrict the data from being sent to BMC Helix Operations Management by deactivating Agent-side data collection.
You can configure the following types of filters:
- Attribute-level filter - When you apply this filter, only the instance information is sent to BMC Helix Operations Management. Information about the attributes of the instance is not sent to BMC Helix Operations Management.
- Instance-level filter - When you apply this filter, information about attributes and instances is not sent to BMC Helix Operations Management. If you have instances in a parent-child hierarchy, you must configure instance-level filters separately for each sub-node of the parent and child.
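The difference between the two filter types can be sketched as follows. The record layout (`name` plus `attributes`) and the `apply_filter` function are hypothetical and serve only to show what each filter lets through.

```python
from typing import Optional


def apply_filter(instance: dict, filter_type: Optional[str]) -> Optional[dict]:
    """Return the payload that would be forwarded, or None if nothing is sent.

    `instance` is a hypothetical record: {"name": ..., "attributes": {...}}.
    """
    if filter_type == "instance":
        # Instance-level filter: neither the instance nor its attributes are sent.
        return None
    if filter_type == "attribute":
        # Attribute-level filter: only the instance information is sent.
        return {"name": instance["name"], "attributes": {}}
    # No filter: everything is sent.
    return instance


disk = {"name": "C:", "attributes": {"used_pct": 71.5}}
```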
Note: Possible delay in the deployment of policy rules
A PATROL Agent applies the policy for deactivating collection during the next scheduled discovery or collection of a PATROL object (monitor type or attribute). This means the policy is applied when the first collection occurs on the PATROL object after the policy rules are applied. Due to this behavior, there might be a delay in the deployment of the policy rules on the PATROL object.
A deployable package consists of monitoring solution components that you can select and install together. These components are also called installation components. You can reuse deployable packages on one or more computers.
A device is any entity that can be monitored by BMC Helix Operations Management. You can create a device in BMC Helix Operations Management by defining a monitoring policy.
BMC Helix Operations Management collects data from many sources. Built-in algorithms convert the information collected about entities into devices, determine whether two or more pieces of data refer to the same entity, and consolidate those entities into a single device.
An event is a notification that indicates a change in the state of an application or device that you are monitoring. An event can represent an error or warning, the crossing of a set threshold, or confirmation that everything is working as expected. Alarms are a type of event generated when user-specified threshold values are violated. PATROL events are events generated by the PATROL Agent.
BMC Helix Operations Management provides machine-guided event analysis and intelligently groups events into clusters. You can easily identify hotspots in an event flood.
An event class defines a type of event that BMC Helix Operations Management accepts, and it classifies the source event information for processing. For example, suppose Microsoft Windows operating system events are defined as one event class. This class has several subclasses: application events, security events, and system events. Each event subclass represents a common type of event that occurs, for example, a network login event. Each network login event contains varying data values, such as the timestamp and the name of the host on which the login occurred. The varying data values from a source event are stored in data fields called slots (sometimes known as attributes). An actual event is an instance of an event class.
BMC Helix Operations Management comes with the base EVENT class (parent class with all the basic slots), the PATROL class (events raised by PATROL Agents), and the ALARM class (alarm events raised by the alarm policies).
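The relationship between event classes, slots, and event instances maps naturally onto class inheritance. The sketch below reuses the class names EVENT, PATROL, and ALARM from the text, but every slot name (`msg`, `host`, `severity`, `agent_port`, `metric`, `threshold`) is a hypothetical placeholder rather than the product's actual slot list.

```python
from dataclasses import dataclass


@dataclass
class EVENT:
    """Base class: slots shared by every event (slot names are hypothetical)."""
    msg: str
    host: str
    severity: str = "INFO"


@dataclass
class PATROL(EVENT):
    """Events raised by PATROL Agents; adds a hypothetical agent-specific slot."""
    agent_port: int = 0


@dataclass
class ALARM(EVENT):
    """Alarm events raised by alarm policies; adds hypothetical threshold slots."""
    metric: str = ""
    threshold: float = 0.0


# An actual event is an instance of an event class.
alarm = ALARM(msg="CPU above threshold", host="web01", severity="CRITICAL",
              metric="cpu_util", threshold=90.0)
```

The subclass inherits the common slots from the base class and adds its own, which mirrors the statement that some slots are common to all event classes while others are unique to one class.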
Event noise is the term used to describe the hundreds of hourly and daily notifications and alarms (for example, CPU utilization, memory utilization, or end user response time) delivered by monitoring systems to IT operations teams to show the health and performance of infrastructure and applications across their IT environment.
Event noise reduction
Event noise reduction involves reducing the event storm by combining multiple matching events into a single, aggregated event. Event noise reduction enables you to perform prioritized triage and remediation.
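Combining matching events into a single aggregated event can be sketched as a simple grouping step. The matching rule used here (same host and same message) and the `Event` type are assumptions chosen for illustration; the source does not specify how events are matched.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    host: str
    msg: str


def aggregate(events: List[Event]) -> List[dict]:
    """Collapse events that share host and message into one record with a count."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for event in events:
        counts[(event.host, event.msg)] += 1
    return [{"host": host, "msg": msg, "count": count}
            for (host, msg), count in counts.items()]


# Four raw events collapse into two aggregated events.
storm = [Event("db01", "disk full")] * 3 + [Event("web01", "high cpu")]
aggregated = aggregate(storm)
```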
An event policy helps you process events and set up routine event management actions quickly and easily. In this policy, you can define actions that must be run on the occurrence of particular events. You can define various types of event policies based on the type of actions that you want to run. For example, you can create an enrichment policy to refine particular attribute values of the event and make the event more meaningful. A notification policy can be used to notify users via email that an event has occurred, and so on.
Slots identify information within an event class. When you view an event, you can see various key pieces of information identified as slots. Slots can be used in various ways. For example, you can use them for filtering, to enrich events by creating an event policy, and so on.
Each event class has defined slots. Some slots are common to all event classes, while others are unique to an event class. The default slots in the event list provide basic information about an event. BMC Helix Operations Management comes with a set of out-of-the-box (or default) slots. These slots are included when a new event occurs. However, some slots are internal and therefore hidden from the BMC Helix Operations Management console. You can also define custom slots by running the API for creating a custom class.
Slot definitions specify the slot types (or data types) that are acceptable for processing. Slot values can have various data types, such as String, Integer, Long, List of String, List of Integer, and so on. Some slots are marked as Enum (or enumeration). These slots are defined with an acceptable list of values. For example, the status slot can only have these values:
Enumeration (or Enum) data types specify acceptable values for a particular event slot. An enumeration associates constant values for an event slot. To specify a list of values for an Enum data type event slot, you can use out-of-the-box enumerations or use a custom list of values created by using custom enumerations.
Each group is a logical collection of monitored devices in BMC Helix Operations Management. Groups allow you to filter and group impacted devices based on an entity selection query.
An incident is any event that is not part of the standard operation of a service and that causes an interruption or a reduction in the quality of that service.
Policies are a set of rules that enable administrators to automatically deploy configurations on PATROL Agents and monitoring solutions. A policy is applied to the PATROL Agents based on conditions such as Agent name, Agent port, Agent version, Agent tag, and so on. If an Agent matches the conditions defined in the policies, the policies are applied to the Agent.
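Condition-based policy assignment can be sketched as a predicate over Agent properties. The `Agent` and `PolicyConditions` types and the specific matching rules (name prefix, exact port, exact tag) are hypothetical; the real condition syntax is not described in this text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    name: str
    port: int
    version: str
    tag: str


@dataclass
class PolicyConditions:
    """Hypothetical selection conditions; None means 'do not check this field'."""
    name_prefix: Optional[str] = None
    port: Optional[int] = None
    tag: Optional[str] = None

    def matches(self, agent: Agent) -> bool:
        """The policy applies only if every configured condition holds."""
        if self.name_prefix is not None and not agent.name.startswith(self.name_prefix):
            return False
        if self.port is not None and agent.port != self.port:
            return False
        if self.tag is not None and agent.tag != self.tag:
            return False
        return True


# Made-up Agent and conditions for illustration.
linux_prod = PolicyConditions(name_prefix="lnx-", tag="prod")
agent = Agent(name="lnx-app01", port=3181, version="11.0", tag="prod")
```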
A Knowledge Module (KM) is a set of files from which a PATROL Agent receives information about resources running on a monitored computer. A KM file can contain the actual instructions for monitoring objects or simply a list of KMs to load. KMs are loaded by a PATROL Agent.
Latency is the amount of time a transaction takes to complete.
A monitor type (also referred to as an application class) is the object class to which application instances belong. A monitor type is a way of classifying the data that is to be collected.
A monitoring profile is a profile with which the monitor types that you want to enable are associated. Each solution contains multiple monitoring profiles, which help reduce unnecessary monitoring. Each monitoring profile is associated with a group of monitor types. The monitor types that belong to a profile are predetermined. You cannot add monitor types to or remove them from a profile.
A monitoring solution is a pre-defined set of metrics that monitor the health and performance of a specific device or service. BMC monitoring solutions are composed of monitor types and attributes.
PATROL Agent is a framework that provides an interface to collect information about the resources of a monitored computer through a set of files called Knowledge Modules (KMs).
A situation comprises events associated with a host that have been aggregated based on when they occurred, their message content, and their relationships to the service or application topology. The policy-based (also known as rule-based) situation uses a correlation event policy to aggregate events and identify situations in the system.
A data poll is the frequency at which you want to collect data. In a monitor policy, you can specify the time interval between consecutive data polls.
Probable cause analysis (PCA)
Probable cause analysis (PCA) is the ability to determine the most likely causes of any issue in an infrastructure environment by correlating millions of monitoring data points and analyzing the relationship between infrastructure nodes and services. The goal is to reduce the mean time to identify or determine (MTTI) and mean time to resolve (MTTR) for issues.
Role-based access control (RBAC) for the features and components in BMC Helix Operations Management is enabled by persona-based authorization profiles.
A repository is a bundle of monitoring solutions and knowledge modules (KMs) that are shipped together in BMC Helix Operations Management.
A service is a logical group of applications, middleware, security, storage, networks, and other subservices that work together to achieve a comprehensive, end-to-end business goal. HR service, administrative service, and payroll service are a few examples of business services.
A service topology is a graphical representation that shows a dynamic and pictorial view of a service model.
Service health and impact score
The service health score and service impact score are the two most important indicators of service health. The health and impact scores provide a quick insight into service health and enable you to take timely action.
Start anywhere application modeling is a quick and easy approach to modeling that enables you to choose any entry point into an application or Business Service and begin modeling from there.
A Business Application is a system that provides a business function to users or customers of the business. Applications generally involve multiple separate pieces of software such as application servers and databases, plus network services such as load balancers.