Investigating events


To open the Active Events screen, select Active Events from the Events menu.

The Active Events screen provides tools for diagnosing problems in your system. Problems are indicated as Low or High deviation from normal.

Based on your historical data, the product builds models that reflect what is normal in your environment. Real-time or playback data is compared against these models, and a Z-Score is calculated for each KPI Group. The Z-Score measures how far a raw value is from its modeled mean, in units of standard deviation. The High deviation limit is twice as many standard deviations from the mean as the Low deviation limit.

For example, if we define the low deviation limit as ±1.5 standard deviations from the mean, and the high deviation limit as ±3.0 standard deviations from the mean, then the Low deviation state range is between 1.5 and 3 standard deviations from the mean and the High deviation state range is more than 3 standard deviations from the mean. The model sensitivity dictates the normal range and the ranges of the levels of deviation.
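
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the Z-Score idea, not the product's implementation: the function names, the sample values, and the handling of values that fall exactly on a limit are assumptions, and the limits 1.5 and 3.0 are the ones used in the example above.

```python
# Illustrative limits from the example in the text; the actual limits
# depend on the configured sensitivity level.
LOW_LIMIT = 1.5
HIGH_LIMIT = 3.0


def z_score(value, modeled_mean, modeled_std):
    """Return the distance of a raw value from the modeled mean,
    expressed in standard deviation units."""
    if modeled_std <= 0:
        raise ValueError("modeled_std must be positive")
    return (value - modeled_mean) / modeled_std


def deviation_state(z, low=LOW_LIMIT, high=HIGH_LIMIT):
    """Map a Z-Score to 'Normal', 'Low', or 'High' deviation.

    Assumption: a value exactly on a limit counts as Low, because the
    text describes High as "more than" the high limit.
    """
    magnitude = abs(z)  # deviation can be above or below the mean
    if magnitude > high:
        return "High"
    if magnitude >= low:
        return "Low"
    return "Normal"


# Hypothetical KPI Group with a modeled mean of 100 and a standard deviation of 20
print(deviation_state(z_score(80, 100, 20)))   # Z-Score -1.0
print(deviation_state(z_score(130, 100, 20)))  # Z-Score 1.5
print(deviation_state(z_score(170, 100, 20)))  # Z-Score 3.5
```

With these hypothetical inputs, the three calls fall into the Normal, Low, and High ranges respectively. Raising the limits has the same effect as lowering the sensitivity: a value has to be further from the mean before the state changes.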

You can change the sensitivity of the product so that a Z-Score must deviate further from the modeled mean before the state changes. For more information, see Setting-the-sensitivity-level.

If BMC AMI Ops Insight has not detected any anomalies, a summary of the monitored KPIs and KPI groups is displayed.

Summary.png

If anomalies are detected, a summary of the events is displayed at the top of the page:

Active_Events_summary.png

A tile is displayed for each sharing group, subsystem, and LPAR that is in an anomaly state, as follows:

Active_Event_Tile.png

| Item | Description |
| --- | --- |
| 1 | Name of sharing group, subsystem, or LPAR |
| 2 | Indication whether the anomaly is in a sharing group, subsystem, or an LPAR |
| 3 | For subsystem events: LPAR and sharing group of the subsystem on which the anomaly was detected. For LPAR events: sysplex of the LPAR on which the anomaly was detected. |
| 4 | Time the event started |
| 5 | Elapsed time since the event started |
| 6 | Classification of the event |

To view events by LPAR or sharing group

Click Group by, and select LPAR or Sharing Group.

The tiles are grouped on the display by LPAR or sharing group according to your selection.

To investigate an event

Click Investigate in the tile of the event that you want to investigate.

The following tabs are available:

  • Probable Cause Analysis — Information that helps you identify the source of the event
  • Event Progression — Details of the event over time


The Probable Cause Analysis tab displays the most probable sources (classifications) of the anomaly, based on the current data. It provides a graphical representation of the currently impacted KPI Groups and the connections between them on the most critical paths.

Classification.png

Below the graphical representation, the affected KPI Groups are listed with more details.

Graphical representation of affected KPI Groups

The tab contains a graphical representation of the affected KPI Groups and includes the following sections:

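| Item | Section |
| --- | --- |
| 1 | Event timeline |
| 2 | Limit Breaches |
| 3 | Event Classification |
| 4 | Limit Breach indicators |
| 5 | Exceptions |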

The graphical representation displays an icon for each KPI Group that is defined in the application. Each icon indicates the category that the KPI Group belongs to. For details, click Legend (legend_tab.png) on the right of the screen. The KPI Groups are connected by arrows that indicate how they influence each other. These arrows make up the paths that are identified as the most likely sources of the problem (classifications).

The icons are color coded as follows:

| Color | Description |
| --- | --- |
| Red | Classification. The KPI Group has been identified as a source of the current anomaly. |
| Orange | High deviation from normal |
| Amber | Low deviation from normal |
| Grey | Normal |

Limit breaches are indicated in the graphical representation by a red dot orbiting around the affected KPI Group.

breach_in_classification.png

1. Event timeline

Above the graphical representation is a timeline of the event. You can select a specific point on the timeline to see the state of the KPI Groups at that time. All other areas of the tab reflect the selected time. Use the arrows (< or >) to jump backward or forward on the timeline, or click Event Start or Latest Ingested to go to the beginning of the event or to the latest data.

2. Limit Breaches

Limit breaches are listed on the left, above anomaly classifications.

1

The first KPI Group listed in a limit breach is the KPI Group with the breach.

2

Additional KPI Groups that are listed below the breached KPI Group are probable causes of the limit breach.

3

Click the Limit Breach Path button to see the path from a possible cause to the breached KPI Group.

In some cases, the breached KPI Group is also the source of the problem. If that happens, the KPI Group appears in red with the orbiting dot, and no Limit Breach Path is available.

Solo_breach.png

Breach_Details.png

3. Event Classification

The Event Classification section is displayed as follows:

Classification_Details.png

Each tile shows the following details:

Item

Description

1

Category of the classification KPI Group

2

Name of the classification KPI Group

3

Current Z-Score of the classification KPI Group

4

Name of the BMC AMI Ops view that displays more details about the KPI Group

Click the name of the view to open the view.

Click Copy_View_Name.png to copy the command for opening the view in BMC AMI Ops to your clipboard.

5

Checkbox for selecting a single classification path

If a classification KPI Group includes more than one path on the graphical representation, you can select more than one to see how they affect one another.

View_Path.png

6

Click View Detailed Analysis to view the detailed analysis for the classification.

Note

Detailed analysis is only available for certain classifications.

Detailed analysis is only available if you have enabled it. See Enabling-detailed-analysis for more information.

Contention_details.png

The details are divided into two sections:

  • Overview — The overview is divided into two parts:
    • The top part displays details of the identified problem. Click Show Details to see more information.
    • The bottom part displays details of the most likely cause of the problem based on our analysis.
  • Analysis — The analysis section displays the connections between work units involved in the Event and how they affect each other. This can provide you insight into how the problem evolved and how to mitigate it.

4. Limit Breach indicators

Limit breaches appear below the graphical representation at the time they occur. Hover over a limit breach indicator to see the nature of the limit breach and the value.

Breach_below_classification.png

5. Exceptions

Exceptions appear below the graphical representation at the time they occur. Hover over an Exception indicator to see the nature of the Exception:

Exception_In_Classification.png

Exceptions are categorized as Db2 Exceptions or z/OS Exceptions.

Table

The table below the graphical representation displays the following details:

Detail

Description

KPI Group

KPI Groups that are in anomaly state at the selected time

Indicators

Classification_Indicator.png: indicates that the KPI Group is currently identified as a classification KPI Group

Initial_Indicator.png: indicates that the KPI Group was in anomaly state at the start of the event

Category

Categories of the KPI Groups

Anomaly

Type of anomaly

If the anomaly is a limit breach, it's indicated here:

Limit_on_PCA.png

BMC AMI Ops Views

BMC AMI Ops view that displays more details

Click the view name to open the view.

Click Copy_View_Name.png to copy the command for opening the view in BMC AMI Ops to your clipboard.

The Event Progression tab shows how the event developed over time. It is divided into two sections:

  • Event Log—Sequence of KPI Groups going into anomaly and Exceptions occurring
  • Timeline/Graphs—Graphic display of the performance of the KPI Groups and Exceptions over time, or individual graphs per KPI Group for all KPI Groups that are in anomaly state, with a breakdown into the individual KPIs

To filter the Event Progression tab

  1. Click Filter.
  2. Select the Categories and KPI Groups that you want to see.
    If you select or deselect a Category, all of the KPI Groups in the Category are selected or deselected. If you select or deselect a KPI Group, the relevant Category is selected but the other KPI Groups in the Category are not selected.
  3. Click Apply.

Event Log

The Event Log shows a sequence of KPI Groups going into and out of anomaly state, and Exceptions occurring.

Progression_Time_Sequence.png

Item

Description

1

KPI Groups going into anomaly state

KPI Groups are listed here when they go into anomaly state. They are not listed if they stay in anomaly state. KPI Groups that stabilize and go back to anomaly state are relisted.

Exceptions are listed every time they occur. Exceptions are indicated by a red dot.

2

Time of the anomalies

Click the time to see that time reflected on the timeline, in the graphs, and in the log. The selected time is indicated on the Event Log by a box: selected_time.png.

  • If you are in Sequence View, the slider on the timeline moves to the selected time.
  • If you are in Graph View, the graphs reflect the performance of the KPI Groups at the selected time.

3

KPI Groups stabilizing
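
The listing rules for the Event Log (a KPI Group is listed when it goes into anomaly state, is not listed again while it stays in anomaly state, and is relisted if it stabilizes and then goes back into anomaly state) amount to recording state transitions. The following Python sketch shows that idea only. The function name, the sample times, and the KPI Group names are hypothetical, and the sketch is not the product's implementation.

```python
def event_log(samples):
    """Return (time, kpi_group, change) entries for each state transition.

    samples: list of (time, set of KPI Groups in anomaly state at that
    time), in chronological order.
    """
    previous = set()
    entries = []
    for time, in_anomaly in samples:
        # Newly in anomaly state: listed. Groups that were already in
        # anomaly state at the previous sample are not listed again.
        for group in sorted(in_anomaly - previous):
            entries.append((time, group, "anomaly"))
        # No longer in anomaly state: listed as stabilizing.
        for group in sorted(previous - in_anomaly):
            entries.append((time, group, "stable"))
        previous = set(in_anomaly)
    return entries


# Hypothetical samples: CPU stabilizes at 10:02 and returns to anomaly at 10:03
samples = [
    ("10:00", {"CPU"}),
    ("10:01", {"CPU", "Locking"}),
    ("10:02", {"Locking"}),
    ("10:03", {"CPU", "Locking"}),
]
for entry in event_log(samples):
    print(entry)
```

In this sample, Locking is listed only once even though it stays in anomaly state for three samples, while CPU is listed twice because it stabilized in between.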

Timeline

The timeline shows the performance of each KPI Group over time during the event.

Progression_timeline.png

Item

Description

1

Names of the KPI Groups and Exceptions

2

Timeline filter

By default, only KPI Groups that are in anomaly state are displayed. Click the filter and select Anomaly & Stable state to display KPI Groups that were in an anomaly state during this event, but are stable at the selected time.

3

Timeline showing performance for the KPI Groups and Exception KPIs

  • A plain green line indicates that the KPI Group was stable.
  • A yellow or orange rectangle indicates that the KPI Group was in an anomaly state. The orange rectangles are larger to indicate a stronger deviation from normal. The rectangles appear above or below the line to indicate whether the observed value is higher or lower than normal.
  • A red dot indicates that an Exception occurred.
  • Pink shading indicates that there is a limit breach.
    Breach_Shading.png

Note

Click Legend (legend_tab.png) on the right to see the legend that describes the elements of the timeline.

4

Indicators of the current trend for each KPI Group

  • A green up arrow indicates that the metric for the KPI Group is rising.
  • A red down arrow indicates that the metric for the KPI Group is falling.
  • A grey dash indicates that the metric for the KPI Group is holding steady.

5

Slider to see values at a specific time on the timeline

You can also use the < arrow to view what happened earlier in the event. Use the > arrow to move to a later hour.

6

Projected status for the future based on the current trends

Graphs

Note

This functionality is available only if you have the Docker container installed. For more information, see Installing-BMC-AMI-Ops-Insight.

To see graphs for all KPI Groups that are currently in anomaly, click Switch to Graph View.

Graph_View.png

Graph view displays individual graphs per KPI Group for all KPI Groups that are in anomaly state at the selected time, with a breakdown to the individual KPIs. If you select a different time in the Event Log, the graphs change according to the KPI Groups that were in anomaly state at that time.

By default, the graphs show the KPI values. To switch to Z-Scores, select the KPI Z-Score option. To switch back, select KPI Value.

To see KPI values on a graph for a specific time, hover over the graph.

Mouse-over_graph.png

To see a graph for a single KPI, click the KPI name below the graph. To see a graph for multiple KPIs within a KPI Group, hold Ctrl or Shift while you select the KPIs.

Graph_Selected_KPIs.png

To zoom in on a smaller time range, drag your cursor across one of the graphs to select the range. All of the graphs zoom in on the time range that you selected.

Zoom-in_all.png

To see an expanded view of a single graph, click the list next to the graph title and select View.

Zoom-in_on_graph.png

A more detailed view of the graph is displayed:

zoomed-in_graph.png

To return to the view of all of the KPI Group graphs, click the back arrow or press Esc.

zoom-out.png

For z/OS events, the graphs include a filter for WLM importance levels.

WLM_Graph_filter_buttons.png

By default, all WLM levels are shown. Click the All button to open the filter and select individual WLM levels.

WLM_Graph_filters.png

 
