Investigating events


When you log into BMC AMI Ops Insight, it displays the Active Events screen.

The Active Events screen provides tools for diagnosing problems in your system. Problems are indicated as Low or High deviation from normal.

Based on your historical data, the product builds models that reflect what normal is in your environment. The real-time or playback data is compared against these models, and a z-score is calculated for each KPI Group. A z-score measures how far a raw value is from its modeled mean, in units of standard deviation. The High deviation limit is twice as far from the mean as the Low deviation limit. For example, if the Low deviation limit is ±1.5 standard deviations from the mean and the High deviation limit is ±3.0 standard deviations from the mean, then the Low deviation range is between 1.5 and 3 standard deviations from the mean, and the High deviation range is more than 3 standard deviations from the mean. The model sensitivity dictates the normal range and the ranges of the levels of deviation.
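The z-score comparison can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names are hypothetical, the limits of 1.5 and 3.0 are the example values used in this topic, and treating a score of exactly 1.5 or exactly 3.0 as Low deviation is an assumption.

```python
def z_score(value, mean, std_dev):
    """Distance of a raw value from its modeled mean, in standard deviation units."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (value - mean) / std_dev


def deviation_state(z, low_limit=1.5, high_limit=3.0):
    """Classify a z-score using the example limits from this topic.

    Assumption: the limits are symmetric around the mean, and a score that
    sits exactly on a limit is treated as Low deviation.
    """
    distance = abs(z)
    if distance > high_limit:
        return "High"
    if distance >= low_limit:
        return "Low"
    return "Normal"


# A KPI Group modeled with mean 100 and standard deviation 10
print(deviation_state(z_score(105, 100, 10)))  # 0.5 standard deviations from the mean
print(deviation_state(z_score(80, 100, 10)))   # 2.0 standard deviations below the mean
print(deviation_state(z_score(135, 100, 10)))  # 3.5 standard deviations above the mean
```

Raising the limits in this sketch corresponds to lowering the sensitivity: a value must be further from the mean before its state changes.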

You can change the sensitivity of the product so that a z-score must be further from the mean before the state changes. See Setting the sensitivity level for more information.

If BMC AMI Ops Insight has not detected any anomalies, a summary of the monitored KPIs and KPI groups is displayed. 

If anomalies have been detected, a summary of the events is displayed at the top of the page:

A tile is displayed for each sharing group and each subsystem that is in an anomaly state. The following information is displayed for each sharing group or subsystem:

1

Name of sharing group or subsystem

2

Indication whether the anomaly is in a sharing group or a subsystem

3

If the anomaly is in a subsystem, LPAR and sharing group of the subsystem

4

Time the event started

5

Elapsed time since the event started

To view events by LPAR or sharing group

Click Group by, and select LPAR or Sharing Group.

The tiles are grouped by LPAR or sharing group according to your selection. 

To investigate an event

To investigate an event, click Investigate in the tile of the event you want to investigate.

The following tabs are available:

  • Probable Cause Analysis—Information that helps you identify the source of the event
  • Event Progression—Details of the event over time


    The Probable Cause Analysis tab displays the following:


    1

    Status of the Categories at the start of the event

    If you select a sharing group event, only the categories that are relevant to sharing groups are displayed: Contention, I/O, and Workload.

    2

    KPI Groups and Exceptions that were in anomaly state at the start of the event

    3

    Categories of the KPI Groups

    4

    Type of anomaly

    5

    MainView view where you can see more details

    Click to copy the MainView command for accessing the view.

    6

    Button for viewing the Event Progression tab

    7

    Active events that are in the same sharing group or LPAR as this event

    The Event Progression tab shows how the event developed over time. It is divided into two sections:

    • Time sequence—Sequence of KPI Groups going into anomaly and Exceptions occurring
    • Timeline—Graphic display of the performance of the KPI Groups and Exceptions

    To filter the Event Progression tab

    You can filter the Event Progression tab as follows:

    1. Click Filter.
    2. Select the Categories and KPI Groups that you want to see.
      If you select or deselect a Category, all of the KPI Groups in the Category are selected or deselected. If you select or deselect a KPI Group, the relevant Category is selected but the other KPI Groups in the Category are not selected.
    3. Click Apply.
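The selection rules in step 2 can be sketched as a small Python model. This is an illustration only, not the product's code: the class, method names, and KPI Group names are hypothetical (only the Contention and I/O category names come from this topic), and deselecting a single KPI Group is left out of the sketch.

```python
# Hypothetical catalog: KPI Group names are invented for illustration.
CATALOG = {
    "Contention": ["Lock waits", "Latch waits"],
    "I/O": ["Read response", "Write response"],
}


class FilterSelection:
    """Tracks which Categories and KPI Groups are selected in the filter."""

    def __init__(self, catalog):
        self.catalog = catalog
        self.categories = set()
        self.kpi_groups = set()

    def set_category(self, category, selected):
        """Selecting or deselecting a Category applies to all of its KPI Groups."""
        groups = self.catalog[category]
        if selected:
            self.categories.add(category)
            self.kpi_groups.update(groups)
        else:
            self.categories.discard(category)
            self.kpi_groups.difference_update(groups)

    def select_kpi_group(self, category, group):
        """Selecting one KPI Group selects its Category but not its sibling groups."""
        if group not in self.catalog[category]:
            raise KeyError(group)
        self.categories.add(category)
        self.kpi_groups.add(group)


selection = FilterSelection(CATALOG)
selection.select_kpi_group("I/O", "Read response")
selection.set_category("Contention", True)
print(sorted(selection.categories))
print(sorted(selection.kpi_groups))
```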

    Time sequence

    The time sequence shows a sequence of KPI Groups going into and out of anomaly state, and Exceptions occurring.

    1

    KPI Groups going into anomaly state

    KPI Groups are listed here when they go into anomaly state; they are not listed again while they stay in anomaly state. KPI Groups that stabilize and then go back into anomaly state are relisted.

    Exceptions are listed every time they occur. Exceptions are indicated by a red dot.
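The listing rule for KPI Groups amounts to recording state transitions rather than every sample. A minimal Python sketch, assuming a hypothetical input of time-ordered `(time, kpi_group, in_anomaly)` samples and assuming every KPI Group starts out stable:

```python
def time_sequence(samples):
    """Return one entry per state change for each KPI Group.

    samples: iterable of (time, kpi_group, in_anomaly) tuples in time order.
    A group that stays in anomaly state produces no further entries until it
    stabilizes and goes back into anomaly state.
    """
    previous = {}  # last known state per KPI Group; a missing key means stable
    entries = []
    for time, group, in_anomaly in samples:
        was_in_anomaly = previous.get(group, False)
        if in_anomaly and not was_in_anomaly:
            entries.append((time, group, "anomaly"))
        elif was_in_anomaly and not in_anomaly:
            entries.append((time, group, "stabilized"))
        previous[group] = in_anomaly
    return entries


samples = [
    ("10:00", "CPU", True),   # goes into anomaly state: listed
    ("10:01", "CPU", True),   # stays in anomaly state: not listed
    ("10:02", "CPU", False),  # stabilizes: listed
    ("10:03", "CPU", True),   # goes back into anomaly state: relisted
]
for entry in time_sequence(samples):
    print(entry)
```

Exceptions are not modeled here because they are listed every time they occur, with no transition check.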

    2

    Time of the anomalies

    Click the time to see that time reflected on the timeline. The selected time is indicated on the time sequence by a box. The scroller on the timeline moves to the selected time.

    3

    KPI Groups stabilizing

    Timeline

    The timeline shows the performance of each KPI Group over time during the event.

    1

    Names of the KPI Groups and Exceptions

    2

    Filter for the timeline

    By default, only KPI Groups that are in anomaly state are displayed. To display all KPI Groups, click the filter and select Anomaly & Stable state.

    3

    Timeline showing performance for the KPI Groups and Exception KPIs

    • A plain green line indicates that the KPI Group was stable.
    • A yellow or orange rectangle indicates that the KPI Group was in an anomaly state. The orange rectangles are larger to indicate a stronger deviation from normal.
    • A red dot indicates that an Exception occurred.

    Note

    Click the tab on the right to see the legend describing what you see on the timeline.

    4

    Indicators of the current trend for each KPI Group

    • A green up arrow indicates that the metric for the KPI Group is rising.
    • A red down arrow indicates that the metric for the KPI Group is falling.
    • A grey dash indicates that the metric for the KPI Group is holding steady.

    5

    Draggable scroller to see values at a specific time on the timeline

    6

    Projected status for the future based on the current trends