Investigating events


The Active Events screen provides tools for diagnosing problems in your system. Problems are indicated as low or high deviation from normal. 

Based on your historical data, the product builds models that reflect what is normal in your environment. Real-time or playback data is compared against these models, and a Z-Score is calculated for each KPI Group. The Z-Score measures how far a raw value is from its modeled mean, in units of standard deviation. The High deviation limit is twice as many standard deviations from the mean as the Low deviation limit.

For example, if the low deviation limit is defined as ±1.5 standard deviations from the mean and the high deviation limit as ±3.0 standard deviations from the mean, the Low deviation state covers values between 1.5 and 3 standard deviations from the mean, and the High deviation state covers values more than 3 standard deviations from the mean. The model sensitivity dictates the normal range and the ranges of the levels of deviation.
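
The calculation in this example can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the function names, the state labels, and the handling of values that fall exactly on a limit are assumptions, and the limits of 1.5 and 3.0 come from the example above rather than from any product default.

```python
# Illustrative limits from the example above (not product defaults).
LOW_LIMIT = 1.5
HIGH_LIMIT = 3.0


def z_score(value, modeled_mean, modeled_std):
    """Return how far value is from modeled_mean, in standard deviations."""
    if modeled_std <= 0:
        raise ValueError("modeled_std must be positive")
    return (value - modeled_mean) / modeled_std


def deviation_state(z, low=LOW_LIMIT, high=HIGH_LIMIT):
    """Map a Z-Score to a hypothetical state label.

    Assumption: a value exactly on the low limit counts as low deviation,
    and a value exactly on the high limit still counts as low deviation,
    because the example defines High as "more than" the high limit.
    """
    distance = abs(z)  # deviation can be above or below the mean
    if distance > high:
        return "high"
    if distance >= low:
        return "low"
    return "normal"


# A raw value of 115 against a modeled mean of 100 and a standard
# deviation of 10 is 1.5 standard deviations above the mean.
print(deviation_state(z_score(115.0, 100.0, 10.0)))
# A raw value of 60 is 4 standard deviations below the mean.
print(deviation_state(z_score(60.0, 100.0, 10.0)))
```

Passing different values for low and high shows how a change in sensitivity would widen or narrow the normal range in this sketch.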

You can change the sensitivity of the product so that a value must deviate further from its modeled mean, that is, reach a higher Z-Score, before the state changes.

For more information, see Setting-the-global-sensitivity-level.

To see the active events

To open the Active Events screen, select Active Events from the Events menu.

If BMC AMI Ops Insight has not detected any anomalies, a summary of the monitored KPIs and KPI groups is displayed.

Summary.png

A summary of the events is displayed at the top.

image-2023-7-5_19-41-0.png

A tile is displayed for each Db2 Data Sharing Group, Db2 subsystem, and LPAR that is in an anomaly state.

image-2024-3-20_20-42-21.png

Item | Description
1 | Name of the Db2 Data Sharing Group, Db2 subsystem, or LPAR
2 | Indicates whether the anomaly is in a Db2 Data Sharing Group, a Db2 subsystem, or an LPAR
3 | For Db2 subsystem events: the LPAR and Db2 Data Sharing Group of the Db2 subsystem on which the anomaly was detected. For LPAR events: the Sysplex of the LPAR on which the anomaly was detected.
4 | Time when the event started
5 | Duration of the event
6 | Classification of the event
7 | Drill-down hyperlink to investigate an event

To view events by LPAR or Db2 Data Sharing Group

On the Active Events page, click the Group by list and select LPAR or Db2 Data Sharing Group.

The tiles are grouped on the display by LPAR or Db2 Data Sharing Group according to your selection.

To filter specific events, click Filter, check the events you want to view, and then click Apply.
image-2024-3-20_20-43-22.png

To investigate an event

To investigate an event, click Investigate in the tile of the event that you want to investigate.

The following options are available for investigating the event:

Click image2023-2-7_12-2-23.png next to the event name to see the View Event Summary and View Event in Timeline options.

To analyze the probable cause of an event

The Probable Cause Analysis tab displays the most probable sources (classifications) of the anomaly based on the current data. It provides a graphical representation of the current impacted KPI Groups and their interconnections on the most significant paths. 

You can expand the WLM importance total group by clicking any node to display the available WLM importance groups in it. The Event Classification section has a filter option for viewing a subset of the information in the classification graph.

image-2023-9-13_20-22-52.png

The graphical representation displays an icon for each KPI Group that is defined in the application. Each icon indicates the category that the KPI Group belongs to. For details, click Legend (legend_tab.png) on the right of the screen. The KPI Groups are connected by arrows that indicate how they influence each other. These arrows are then included in the paths that are identified as the most likely sources of the problem (classifications).

The icons are color coded as follows:

  • Red: Indicates that the KPI Group has been identified as a source of the current anomaly
    BMC.AMIOPS.SPE2501: Click a red node to see the probable cause path. If there is more than one probable cause path, the first path is highlighted. You can use the classification cards on the left to highlight the other paths.
    If you have installed and configured BMC AMI Assistant, clicking the node also provides an explanation of the path. For more information, see Enabling-BMC-AMI-Assistant.
  • Orange: Indicates high deviation from normal
  • Amber: Indicates low deviation from normal
  • Grey: Indicates normal behavior

Limit breaches are indicated in the graphical representation by a red dot orbiting around the affected KPI Group.

breach_in_classification.png

The Probable Cause Analysis tab includes the following features: 

Item

Feature

1

2

3

4

5

6

7

8

9

Event Timeline

Above the graphical representation is a timeline of the event. You can select a specific point on the timeline to see the state of the event at that time. All other areas of the tab reflect the time selected on the timeline. Use the arrows (< or >) to move backward or forward on the timeline, or click Event Start or Latest Ingested to go to the beginning of the event or to the latest data.

Limit Breaches

Limit breaches are listed on the left, above anomaly classifications.

OI_events.png

Item | Description
1 | The first KPI Group listed in a limit breach is the KPI Group with the breach.
2 | Additional KPI Groups that are listed below the breached KPI Group are probable causes of the limit breach.
3 | Click the Limit Breach Path button to see the path from a possible cause to the breached KPI Group.

In some cases, the breached KPI Group might be the source of the problem. If that happens, the KPI Group appears in red with the orbiting dot, and no Limit Breach Path is available.

Solo_breach.png

Event Classification

The Event Classification section is as follows:

image-2023-4-6_18-13-55.png


Item

Description

1

Category of the classification KPI Group

2

Name of the classification KPI Group

3

Current Z-Score of the classification KPI Group

4

Name of the BMC AMI Ops view that displays the details about the KPI Group

Click the view name to open the view.

Important

If the link to the view is disabled, make sure that the CASID is defined in the data preparation address space, that ENABLE_RCA=true is set in amipdt.properties, and that the target (such as Db2 or LPAR) is monitored by BMC AMI Ops Monitor products.

Click Copy_View_Name.png to copy the command that opens the BMC AMI Ops view to your clipboard.

5

Checkbox for selecting a single classification path

If a classification KPI Group includes more than one path on the graphical representation, you can select more than one to see how they affect one another.

View_Path.png

6

Click View Detailed Analysis to view the detailed analysis for the classification.

Important

  • Detailed analysis is only available for Contention, I/O, CPU, and Workload classifications.
    To work with the detailed analysis, see Viewing-detailed-analysis.
    To enable it, see Enabling-detailed-analysis.
  • If the View Detailed Analysis option is not available, make sure the CAS ID is defined in the data preparation address space and the target (such as Db2, LPAR) is monitored by BMC AMI Ops Monitor products.
  • The View Detailed Analysis option is enabled after three minutes of an active event.

The View Detailed Analysis window includes:

  • Overview—Displays the identified problem and the most likely cause of the problem based on Workload Probable Cause analysis. 
  • Analysis—Displays the connections between work units involved in the Event and how they affect each other. This can provide you insight into how the problem evolved and how to mitigate it.

For more information on detailed analysis, see Viewing-detailed-analysis.

Limit Breach indicators

Limit breaches appear below the graphical representation at the time they occur. Hover over a limit breach indicator to see the nature of the limit breach and the value.

Breach_below_classification.png

Exceptions

Exceptions appear below the graphical representation at the time they occur. Hover over an Exception indicator to see the nature of the Exception.

Exceptions are categorized as Db2 Exceptions or z/OS Exceptions.

Exception_In_Classification.png

WLM grouping

The WLM grouping is displayed as an azure circle that encloses the nodes that are grouped together.

image2022-6-29_18-1-27.png

Change in classification

Any change in the classification while an event is running is indicated by image2022-6-28_19-56-11.png on the Event timeline.

image2022-6-29_18-7-34.png

Click image-2024-2-9_11-32-59.png below the graphical representation to see the details of the anomalies detected:

image-2024-5-17_17-23-50.png

Item

Description

KPI Group

KPI Groups that are in anomaly state at the selected time

Indicators

Classification_Indicator.png—Indicates that the KPI Group is currently identified as a classification KPI Group

Initial_Indicator.png—Indicates that the KPI Group was in anomaly state at the start of the event

Category

Categories of the KPI Groups

Anomaly

Type of anomaly

If the anomaly is a limit breach, it's indicated here.

Limit_on_PCA.png

BMC AMI Ops Views

BMC AMI Ops views that display more details

Click the view name to open the view.

Click Copy_View_Name.png to copy the command that opens the BMC AMI Ops view to your clipboard.

Latest Ingestion

Latest Ingestion displays the latest date and time at which data was processed. If the Latest Ingestion value does not change, data might not be getting collected and processed.

For more information, see Viewing data in the footer bar.

Event End 

The Event End displays when an event ended. It is displayed when you are looking at historical events. If you are in Active Events, the latest ingestion value is displayed.

To view an event progression

The Event Progression tab shows how the event developed over time. It is divided into two sections:

  • Event Log—Sequence of KPI Groups going into anomaly and Exceptions occurring
  • Timeline and Graphs—Graphic display of the performance of the KPI Groups and Exceptions over time or individual graphs per KPI Group for all KPI Groups that are in anomaly, with a breakdown to the individual KPIs

To filter the Event Progression

  1. Click Filter.
  2. Select the Categories and KPI Groups that you want to see.
    If you select or deselect a Category, all of the KPI Groups in the Category are selected or deselected. If you select or deselect a KPI Group, the relevant Category is selected but the other KPI Groups in the Category are not selected.
  3. Click Apply.

To see the Event Log

The Event Log shows a sequence of KPI Groups going into and out of anomaly state, and exceptions occurring.

image-2023-4-6_18-18-29.png

Item

Description

1

KPI Groups going into anomaly state

KPI Groups are listed here when they go into anomaly state. After getting listed, if there is no change in the state, the KPI Groups are not listed again. KPI Groups that stabilize and go back to anomaly state are relisted.

Exceptions are listed every time they occur and are indicated by a red dot.

2

Time of the anomalies

Click the time to see the time reflected on the timeline, in the graphs, and log. The selected time is indicated on the Event Log by a box: selected_time.png.

  • If you are in Sequence View, the slider on the timeline moves to the selected time.
  • If you are in Graph View, the graphs reflect the performance of the KPI Groups at the selected time.

3

KPI Groups stabilizing

If the KPI Groups have stabilized after being in an anomaly state, they are listed here.
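
The listing rules described for the Event Log can be pictured as state transitions per KPI Group. The following Python sketch is an illustration only, written under stated assumptions: the function name, the sample tuple layout, and the entry labels are hypothetical, and it models only two states (anomaly and stable), without Exceptions or the distinction between low and high deviation.

```python
def build_event_log(samples):
    """Build a simplified event log from time-ordered samples.

    samples: iterable of (time, kpi_group, in_anomaly) tuples.
    Returns a list of (time, kpi_group, entry), where entry is
    "anomaly" when a group enters anomaly state and "stabilized"
    when it returns to normal. A group whose state has not changed
    since its last sample produces no new entry.
    """
    last_state = {}  # kpi_group -> True if it was in anomaly at its last sample
    log = []
    for time, group, in_anomaly in samples:
        was_in_anomaly = last_state.get(group, False)
        if in_anomaly and not was_in_anomaly:
            log.append((time, group, "anomaly"))
        elif was_in_anomaly and not in_anomaly:
            log.append((time, group, "stabilized"))
        last_state[group] = in_anomaly
    return log


# Hypothetical samples: the group enters anomaly state, stays there,
# stabilizes, and then goes back into anomaly state.
samples = [
    ("10:00", "CPU", True),
    ("10:01", "CPU", True),
    ("10:02", "CPU", False),
    ("10:03", "CPU", True),
]
for entry in build_event_log(samples):
    print(entry)
```

In this sketch the 10:01 sample adds nothing to the log because the state did not change, while the 10:03 sample is listed again because the group stabilized in between.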

To see the timeline

The timeline shows the performance of each KPI Group over time during the event.

image-2024-5-17_17-31-43.png

Item

Description

1

Timeline filter

By default, only KPI Groups that are in anomaly state are displayed. Click the filter and select Anomaly & Stable state to display KPI Groups that were in an anomaly state during this event, but are stable at the selected time.

2

Names of the KPI Groups and Exceptions

3

Indicates the current trend for each KPI Group

  • A green up arrow indicates that the metric for the KPI Group is rising.
  • A red down arrow indicates that the metric for the KPI Group is falling.
  • A grey dash indicates that the metric for the KPI Group is holding steady.

4

Timeline showing performance for the KPI Groups and Exception KPIs

  • A plain green line indicates that the KPI Group was stable.
  • A yellow or orange rectangle indicates that the KPI Group was in an anomaly state. The orange rectangles are larger to indicate a stronger deviation from normal. The rectangles appear above or below the line to indicate whether the observed value is higher or lower than normal.
  • A red dot indicates that an Exception occurred.
  • Pink shading indicates that there is a limit breach.
    Breach_Shading.png

Important

Click Legend (legend_tab.png) on the right to see the legend describing what you see on the timeline.

5

Slider to see values at a specific time on the timeline

You can also use the < arrow to view what happened earlier in the event. Use the > arrow to move to a later hour.

To see the graphs

Important

This functionality is available only if you have the Docker container installed. For more information, see Installing-BMC-AMI-Ops-Insight.

  • To see graphs for all KPI Groups that are currently in anomaly, click Switch to Graph View.
  • To view the graphs in full screen mode, click image2022-6-29_18-21-38.png. To return to normal mode, click image2022-6-29_18-23-28.png.
    image2022-6-29_18-19-38.png
    Graph view displays individual graphs per KPI Group for all KPI Groups that are in anomaly state at the selected time, with a breakdown to the individual KPIs. If you select a different time in the Event Log, the graphs change according to the KPI Groups that were in anomaly state at that time.
    By default, the graphs show the KPI values. To switch to Z-Scores, select the KPI Z-Score option. To switch back, select KPI Value.
  • To see KPI values on a graph for a specific time, hover over the graph.
    Mouse-over_graph.png
  • To see a graph for a single KPI, click the KPI name below the graph.
  • To see a graph for multiple KPIs within a KPI Group, press Ctrl or Shift to select multiple KPIs.
    Graph_Selected_KPIs.png
  • To zoom in on a smaller time range, drag your cursor across one of the graphs to select the range. All of the graphs zoom in on the time range that you selected.
    Zoom-in_all.png
  • To see an expanded view of a single graph, click the list next to the graph title and select View.
    Zoom-in_on_graph.png
    A more detailed view of the graph is displayed:
    zoomed-in_graph.png
  • To return to the view of all of the KPI Group graphs, click the back arrow or press Esc.
    zoom-out.png
    For z/OS events, the graphs include a filter for WLM importance levels.
    WLM_Graph_filter_buttons.png
    By default, all WLM levels are shown. Click the All button to open the filter and select individual WLM levels.
    WLM_Graph_filters.png

To return to the previous page

To return to the previous page from the Timeline page, click the Back to pageName link at the top left of the current page.

Important

The Back to pageName link is not available on drill-down pages.

 
