Product overview
With this product, you can achieve the following major goals:
- Collect data to monitor your infrastructure environment
- Monitor events and reduce event noise
- Detect anomalies in the system
- Manage maintenance windows
- Gain insight into the system with logs
- Monitor and investigate services and situations
Collect data
The product collects metrics about the components that you are monitoring in your infrastructure, such as the Oracle database or the Windows operating system. You monitor the environment with the help of PATROL Agents and knowledge modules (KMs).
For more information, see Collecting-data.
Monitor events and reduce event noise
As an administrator, identify actionable events from a large volume of event data by processing events in various ways.
As an operator, use a centralized event view to monitor and manage events.
For more information, see Monitoring-events-and-reducing-event-noise.
Detect anomalies in the system
An anomaly is an observation that diverges from a well-structured data pattern, such as an irregular spike in time-series data or an unclassifiable data point within a specific data set. An anomaly can occur independently or result from a combination of factors. For example, slow response time combined with high memory utilization may impact the expected system behavior.
As an administrator, create alarm and variate policies to help you monitor and manage the health of your system and detect anomalies. These policies can also help you detect abnormal behavior in your monitoring data more accurately by reducing:
- False positives: Scenarios where an alarm is raised even though the system exhibits normal behavior.
- False negatives: Scenarios where the product fails to raise an alarm even though an abnormal metric condition occurs.
For more information, see Detecting-anomalies-by-using-static-and-dynamic-thresholds.
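The difference between a static and a dynamic threshold can be illustrated with a minimal Python sketch. This is not the product's implementation: the function names, the rolling-window baseline, and the sample values are all assumptions made for illustration only.

```python
from statistics import mean, stdev

def static_alarms(values, threshold):
    """Return indexes of samples that exceed a fixed threshold."""
    return [i for i, v in enumerate(values) if v > threshold]

def dynamic_alarms(values, window=5, k=3.0):
    """Return indexes of samples that lie more than k standard deviations
    above the mean of the preceding `window` samples (window must be >= 2)."""
    alarms = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if values[i] > mu + k * sigma:
            alarms.append(i)
    return alarms

# Hypothetical CPU utilization samples (percent); index 6 is the real spike.
cpu = [40, 42, 41, 43, 42, 41, 85, 42, 43, 41]

print(static_alarms(cpu, 42))   # threshold set too low: extra alarms on normal samples
print(static_alarms(cpu, 80))   # well-chosen static threshold
print(dynamic_alarms(cpu))      # baseline derived from recent samples
```

In this sketch, a static threshold that is set too low flags ordinary samples (false positives), while one set too high would miss real spikes (false negatives). The dynamic variant derives its limit from recent data instead of a fixed number.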
Monitor and investigate services and situations
You can monitor system health, reduce event noise, perform probable cause analysis of impacted services, and identify more opportunities to remediate services and situations in your environment.
The key performance indicators (KPIs) provide a quick-peek summary of the overall system health status. Situations reduce event noise by dynamically aggregating events based on an event correlation policy to derive actionable insights. You can perform probable cause analysis for impacted services.
For more information, see Monitoring-and-investigating-services-and-situations.
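As a rough illustration of how aggregating events can reduce noise, the following Python sketch groups events that share a service and occur close together in time. The `Event` fields, the `aggregate` function, and the five-minute window are hypothetical and are not taken from the product's event correlation policies.

```python
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: int   # seconds since an arbitrary reference point
    service: str
    message: str

def aggregate(events, window_seconds=300):
    """Group events per service. A new group starts when an event arrives
    more than window_seconds after the first event of the current group."""
    groups = []
    open_group = {}  # service -> group that currently accepts events
    for ev in sorted(events, key=lambda e: e.timestamp):
        group = open_group.get(ev.service)
        if group is None or ev.timestamp - group[0].timestamp > window_seconds:
            group = []
            groups.append(group)
            open_group[ev.service] = group
        group.append(ev)
    return groups

# Hypothetical event stream
events = [
    Event(0, "checkout", "database latency high"),
    Event(60, "checkout", "HTTP 500 rate high"),
    Event(90, "search", "index lag"),
    Event(1000, "checkout", "database latency high"),
]

for group in aggregate(events):
    print(group[0].service, [e.message for e in group])
```

Here four raw events collapse into three groups, so an operator reviews one item for the two related checkout events instead of two separate ones.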
Manage maintenance windows
During maintenance periods, you don't want to receive events indicating that your services are unavailable. As an administrator, configure blackout policies to suppress unwanted events or ignore events for a specific time period. If you define your scheduled maintenance windows correctly, the blacked-out events are not included in the performance monitoring of your infrastructure environment.
For more information, see Managing-maintenance-windows.
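Conceptually, a blackout policy acts as a time-based filter on events. The following Python sketch shows the idea under simple assumptions: the function names, the tuple-based event format, and the half-open window (start inclusive, end exclusive) are illustrative choices, not the product's policy model.

```python
from datetime import datetime

def in_blackout(ts, windows):
    """Return True if ts falls inside any (start, end) maintenance window.
    The start is inclusive and the end is exclusive."""
    return any(start <= ts < end for start, end in windows)

def filter_events(events, windows):
    """Drop events whose timestamp falls inside a maintenance window."""
    return [(ts, msg) for ts, msg in events if not in_blackout(ts, windows)]

# Hypothetical maintenance window: 22:00 on January 6 to 02:00 on January 7
windows = [(datetime(2024, 1, 6, 22, 0), datetime(2024, 1, 7, 2, 0))]

events = [
    (datetime(2024, 1, 6, 21, 30), "disk full"),
    (datetime(2024, 1, 6, 23, 15), "service unavailable"),  # inside the window
    (datetime(2024, 1, 7, 2, 0), "service unavailable"),    # window has ended
]

for ts, msg in filter_events(events, windows):
    print(ts.isoformat(), msg)
```

Only the event raised at 23:15 is suppressed. The event at exactly 02:00 is kept because the window end is treated as exclusive in this sketch.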
Create interactive dashboards
For more information, see Viewing-collected-data.