Cloud aggregate monitors
All cloud entities (pods, network containers, and compute pools) aggregate key performance metrics: CPU and memory utilization, disk IOPS, and network throughput. Aggregate availability and aggregate status are also tracked for the cloud constructs.
The metrics are rolled up from the underlying entities in the cloud topology (anchored by location or pod) and infrastructure (compute pool) hierarchies. The aggregated values provide a measure of the performance, availability, and health of each cloud construct as a whole. The aggregate metrics help track the general behavior of a cloud construct and support analytics on it. To see how the underlying elements contribute to an aggregated metric, BMC recommends using the out-of-the-box stacked charts for the cloud construct. These charts are especially useful for gauging individual element behavior while troubleshooting a cloud construct.
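The relationship between an aggregated metric and the per-element view of a stacked chart can be illustrated with a minimal sketch. It is not the product's implementation: the `ChildMetric` type, the `roll_up` function, the host names, and the choice of summing disk IOPS are all assumptions made for illustration, since the roll-up function used for each metric is not specified here.

```python
from dataclasses import dataclass


@dataclass
class ChildMetric:
    """Hypothetical sample from one underlying element (for example, a host)."""
    name: str
    disk_iops: float


def roll_up(children):
    """Return the aggregate value and each child's percentage share of it.

    Assumption: the aggregate is a plain sum. The actual roll-up used by
    the product may differ per metric.
    """
    total = sum(c.disk_iops for c in children)
    shares = {
        c.name: (100.0 * c.disk_iops / total if total else 0.0)
        for c in children
    }
    return total, shares


# Illustrative data only; these hosts and values are invented.
hosts = [
    ChildMetric("host-a", 1200.0),
    ChildMetric("host-b", 300.0),
    ChildMetric("host-c", 500.0),
]
total, shares = roll_up(hosts)
print(f"aggregate disk IOPS: {total}")
for name, pct in shares.items():
    print(f"  {name}: {pct:.1f}% of aggregate")
```

The aggregate alone shows the overall load on the construct, while the per-child shares correspond to the breakdown a stacked chart makes visible, which is what points to the element responsible during troubleshooting.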
Like any other monitor type in Infrastructure Management, cloud aggregate monitors support the entire range of analytic capabilities available in Infrastructure Management. You can set global or instance-level thresholds (with or without baselines) to define the conditions that generate alerts from the cloud environment perspective. The aggregate status monitor also introduces a grouping-based approach in which you specify thresholds based on the percentage of children that are in a certain state. For example, you can create a rule that triggers a critical alarm if 30% of the child nodes are in an impacted state or if 5% of the child nodes are in an unavailable state.
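The percentage-based rule in the example above can be expressed as a short sketch. This is an illustration of the logic only, not the product's rule syntax or API: the `evaluate_group_rule` function, the `RULES` list, the `"ok"` state name, and the first-match ordering are hypothetical.

```python
from collections import Counter


def evaluate_group_rule(child_states, rules):
    """Return the severity of the first rule whose threshold is met, else None.

    child_states: list of state names, one per child node.
    rules: list of (state, min_percent, severity) tuples (hypothetical format).
    """
    if not child_states:
        return None
    counts = Counter(child_states)
    total = len(child_states)
    for state, min_percent, severity in rules:
        percent_in_state = 100.0 * counts[state] / total
        if percent_in_state >= min_percent:
            return severity
    return None


# Rules mirroring the example in the text: critical at 30% impacted
# or at 5% unavailable.
RULES = [
    ("impacted", 30.0, "critical"),
    ("unavailable", 5.0, "critical"),
]

# Ten children, three of them impacted: the 30% threshold is met.
states = ["ok"] * 7 + ["impacted"] * 3
print(evaluate_group_rule(states, RULES))
```

Expressing thresholds as a percentage of children rather than an absolute count keeps the same rule meaningful as the number of child nodes in a construct grows or shrinks.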
You can initiate Probable Cause Analysis on an event raised on one of the constituent elements of the compute pool. To better support Probable Cause Analysis on the compute pool, BMC recommends creating a service model to represent the important infrastructure underlying and supporting the compute pool.