Product overview

BMC Helix Operations Management is a SaaS solution on BMC Helix Portal that combines broad monitoring and event management capabilities with a cloud-native, containerized microservices architecture. This architecture enables fast deployment and upgrades, elastic scalability, and enterprise-grade high availability and performance, along with the reduced infrastructure costs of a SaaS deployment model. The solution features a modern user experience and automated workflows that streamline monitoring and event management processes, and it supports large-scale ingestion of events and metrics.

Collect data

The product collects metrics about the components that you monitor in your infrastructure, such as an Oracle database or a Windows operating system. You monitor the environment by using PATROL Agents and knowledge modules (KMs).

For more information, see Collecting-data.

Monitor events and reduce event noise

As an administrator, identify actionable events from a large volume of event data by processing events in various ways.

As an operator, use a centralized event view to monitor and manage events.

For more information, see Monitoring-events-and-reducing-event-noise.

Detect anomalies in the system

Anomalies are observations that diverge from a well-structured data pattern, irregular spikes in time-series data, or unclassifiable data points within a specific data set. An anomaly can occur independently or because of a combination of factors. For example, slow response time combined with high memory utilization might affect the expected system behavior.

As an administrator, create alarm and variate policies to help you monitor and manage the health of your system and detect anomalies. These policies can also help you detect abnormal behavior in your monitoring data more accurately by reducing:

False positives: Scenarios where an alarm is raised even though the system exhibits normal behavior.
False negatives: Scenarios where the product fails to raise an alarm even though an abnormal metric condition occurs.
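
The difference between the two error types can be illustrated with a minimal sketch that compares a fixed (static) threshold with a baseline-relative (dynamic) check. This is not the product's algorithm: the function names, the 90% threshold, the three-standard-deviation rule, and the sample values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def static_alarm(value, threshold=90.0):
    """Alarm when the value exceeds a fixed threshold (hypothetical 90%)."""
    return value > threshold

def dynamic_alarm(value, history, k=3.0):
    """Alarm when the value is more than k standard deviations
    away from the mean of recent samples (a simple assumed baseline)."""
    baseline = mean(history)
    spread = stdev(history)
    return abs(value - baseline) > k * spread

# Hypothetical CPU utilization samples, in percent.
busy_host_history = [91.0, 93.0, 92.0, 94.0, 92.5, 93.5]   # normally runs hot
quiet_host_history = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5]   # normally idle

# 93% is normal for the busy host: the static check raises a false positive.
busy_static = static_alarm(93.0)
busy_dynamic = dynamic_alarm(93.0, busy_host_history)

# 60% is abnormal for the quiet host: the static check gives a false negative.
quiet_static = static_alarm(60.0)
quiet_dynamic = dynamic_alarm(60.0, quiet_host_history)

print(busy_static, busy_dynamic, quiet_static, quiet_dynamic)
```

In this sketch, the baseline-relative check stays silent for the busy host and raises an alarm for the quiet host, whereas the fixed threshold gets both cases wrong.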

For more information, see Detecting-anomalies-by-using-static-and-dynamic-thresholds.

Monitor and investigate services and situations

You can monitor system health, reduce event noise, perform probable cause analysis of impacted services, and boost the remediation opportunities for services and situations in your environment.

The key performance indicators (KPIs) provide a quick summary of the overall system health status. Situations reduce event noise by dynamically aggregating events based on an event correlation policy to derive actionable insights.
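
As a rough illustration of how aggregation reduces noise, the following sketch groups events that share a service and arrive close together in time. The grouping rule, the `service` and `ts` field names, and the 300-second window are hypothetical; the product's correlation policies are not described here and might work differently.

```python
from collections import defaultdict

def aggregate(events, window_seconds=300):
    """Group events per service; consecutive events no more than
    window_seconds apart fall into the same group (assumed rule)."""
    by_service = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_service[event["service"]].append(event)

    situations = []
    for service, evs in by_service.items():
        current = [evs[0]]
        for event in evs[1:]:
            if event["ts"] - current[-1]["ts"] <= window_seconds:
                current.append(event)
            else:
                situations.append({"service": service, "events": current})
                current = [event]
        situations.append({"service": service, "events": current})
    return situations

# Hypothetical events; ts is seconds since an arbitrary start.
events = [
    {"service": "checkout", "ts": 0, "msg": "slow response"},
    {"service": "search", "ts": 30, "msg": "high memory"},
    {"service": "checkout", "ts": 60, "msg": "high memory"},
    {"service": "checkout", "ts": 120, "msg": "timeouts"},
    {"service": "checkout", "ts": 1000, "msg": "slow response"},
]

situations = aggregate(events)
sizes = [(s["service"], len(s["events"])) for s in situations]
print(sizes)
```

Here, five raw events collapse into three groups, so an operator reviews fewer items than the raw event stream contains.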

For more information, see Monitoring-and-investigating-services-and-situations.

Manage maintenance windows

During maintenance periods, you don't want to receive events indicating that your services are unavailable. As an administrator, configure blackout policies to suppress unwanted events or ignore events for a specific time period. If you define your scheduled maintenance windows correctly, the blacked-out events are not included in the performance monitoring of your infrastructure environment.
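
The effect of a blackout window can be sketched as a time-range filter over incoming events. This is a conceptual example only: the data structures, function names, and the choice to treat the window end as exclusive are assumptions, not the product's blackout policy format.

```python
from datetime import datetime

def in_blackout(event_time, windows):
    """Return True if event_time falls inside any (start, end) window.
    The end time is treated as exclusive (an assumption)."""
    return any(start <= event_time < end for start, end in windows)

def filter_events(events, windows):
    """Keep only the events that occur outside all blackout windows."""
    return [e for e in events if not in_blackout(e["time"], windows)]

# Hypothetical maintenance window: 22:00 on January 6 to 02:00 on January 7.
windows = [(datetime(2024, 1, 6, 22, 0), datetime(2024, 1, 7, 2, 0))]

events = [
    {"time": datetime(2024, 1, 6, 21, 30), "msg": "disk usage high"},
    {"time": datetime(2024, 1, 6, 23, 15), "msg": "service unavailable"},
    {"time": datetime(2024, 1, 7, 2, 0), "msg": "service unavailable"},
]

kept = filter_events(events, windows)
kept_msgs = [e["msg"] for e in kept]
print(kept_msgs)
```

Only the event at 23:15 is suppressed. The event at exactly 02:00 is kept because the window has already ended, which shows why the window boundaries need to match the actual maintenance schedule.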

For more information, see Managing-maintenance-windows.

Create interactive dashboards

For more information, see Viewing-collected-data.