Investigating events
The Active Events screen provides tools for diagnosing problems in your system. Problems are indicated as low or high deviation from normal.
Based on your historical data, the product builds models that reflect what is normal in your environment. The real-time or playback data is compared against these models, and a Z-Score is calculated for each KPI Group. The Z-Score measures how far a raw value is from its modeled mean, in standard deviation units. The high deviation limit is twice as many standard deviations from the mean as the low deviation limit.
For example, if the low deviation limit is defined as ±1.5 standard deviations from the mean and the high deviation limit as ±3.0 standard deviations from the mean, then the low deviation state range is between 1.5 and 3 standard deviations from the mean, and the high deviation state range is more than 3 standard deviations from the mean. The model sensitivity dictates the normal range and the ranges of the levels of deviation.
You can change the sensitivity of the product so that the Z-Score of a KPI Group must deviate further from the modeled mean before the state changes.
For more information, see Setting-the-global-sensitivity-level.
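The relationship between a Z-Score and the deviation states can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the function names `z_score` and `deviation_state` are hypothetical, the 1.5 and 3.0 limits are the example values used in this section rather than product defaults, and the handling of a value that falls exactly on a limit is an assumption.

```python
def z_score(raw_value: float, modeled_mean: float, modeled_std: float) -> float:
    """Distance of a raw value from its modeled mean, in standard deviation units."""
    if modeled_std <= 0:
        raise ValueError("modeled_std must be positive")
    return (raw_value - modeled_mean) / modeled_std


def deviation_state(z: float, low_limit: float = 1.5, high_limit: float = 3.0) -> str:
    """Map a Z-Score to a state using the example limits (assumed values)."""
    distance = abs(z)  # the limits are symmetric (plus or minus) around the mean
    if distance > high_limit:
        return "high"
    if distance >= low_limit:
        return "low"
    return "normal"


# A hypothetical KPI modeled with mean 100 and standard deviation 10:
print(deviation_state(z_score(112, 100, 10)))  # Z-Score 1.2
print(deviation_state(z_score(120, 100, 10)))  # Z-Score 2.0
print(deviation_state(z_score(65, 100, 10)))   # Z-Score -3.5
```

In this sketch, passing larger values for `low_limit` and `high_limit` stands in for a less sensitive setting: the same raw value then needs a larger Z-Score to leave the normal state. How the product maps its sensitivity levels to limits is not shown here.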
To see the active events
To open the Active Events screen, select Active Events from the Events menu.
If BMC AMI Ops Insight has not detected any anomalies, a summary of the monitored KPIs and KPI groups is displayed.
At the top, a summary of the events is displayed.
A tile is displayed for each Db2 Data Sharing Group, Db2 subsystem, and LPAR that is in an anomaly state.
Item | Description |
---|---|
1 | Name of Db2 Data Sharing Group, Db2 subsystem, or LPAR |
2 | Indicates whether the anomaly is in a Db2 data sharing group, Db2 subsystem, or an LPAR |
3 | LPAR and Db2 Data Sharing Group of the Db2 subsystem on which the anomaly is detected for Db2 subsystem events or Sysplex of the LPAR on which the anomaly was detected for LPAR events |
4 | Time when the event started |
5 | Duration of the event |
6 | Classification of the event |
7 | Drill-down hyperlink to investigate an event |
To view events by LPAR or Db2 Data Sharing Group
On the Active Events page, click the Group by list, and select LPAR or Db2 Data Sharing Group.
The tiles are grouped on the display by LPAR or Db2 Data Sharing Group according to your selection.
To filter specific events, click Filter, check the events you want to view, and then click Apply.
To investigate an event
To investigate an event, click Investigate in the tile of the event that you want to investigate.
The following options are available for investigating the event:
- Probable Cause Analysis—Information that helps you identify the source of the event
- Event Progression—Details of the event over time
Click next to the event name to see the View Event Summary and the View Event in Timeline.
To analyze the probable cause of an event
The Probable Cause Analysis tab displays the most probable sources (classifications) of the anomaly based on the current data. It provides a graphical representation of the current impacted KPI Groups and their interconnections on the most significant paths.
You can expand the WLM importance total group by clicking any node to display the available WLM importance group in it. The Event Classification section has a filter option to view a subset of information in the classification graph. The Probable Cause Analysis tab contains a graphical representation of the affected KPI Groups.
The graphical representation displays an icon for each KPI Group that is defined in the application. Each icon indicates the category the KPI Group belongs to. For details, click Legend on the right of the screen. The KPI Groups are connected by arrows that indicate how they influence each other. These arrows are then included in the paths that are identified as the most likely sources of the problem (classifications).
The icons are color coded as follows:
- Red: Indicates that the KPI Group has been identified as a source of the current anomaly
Click a red node to see the probable cause path. If there is more than one probable cause path, the first path is highlighted. You can use the classification cards on the left to highlight the other paths.
If you have installed and configured BMC AMI Assistant, clicking the node also provides an explanation of the path. For more information, see Enabling-BMC-AMI-Assistant.
- Orange: Indicates high deviation from normal
- Amber: Indicates low deviation from normal
- Grey: Indicates normal behavior
Limit breaches are indicated in the graphical representation by a red dot orbiting around the affected KPI Group.
The Probable Cause Analysis tab includes the following features:
Event Timeline
Above the graphical representation is a timeline of the event. You can select a specific point on the timeline to see the state of the event at that time. All other areas of the tab are affected by the timeline. Use the arrows (< or >) to move backward or forward on the timeline, or click Event Start or Latest Ingested to go to the beginning of the event or to the latest data.
Limit Breaches
Limit breaches are listed on the left, above anomaly classifications.
Item | Description |
---|---|
1 | The first KPI Group listed in a limit breach is the KPI Group with the breach. |
2 | Additional KPI Groups that are listed below the breached KPI Group are probable causes of the limit breach. |
3 | Click the Limit Breach Path button to see the path from a possible cause to the breached KPI Group. |
In some cases, the breached KPI Group might be the source of the problem. If that happens, the KPI Group appears in red with the orbiting dot, and no Limit Breach Path is available.
Event Classification
The Event Classification section is as follows:
Item | Description |
---|---|
1 | Category of the classification KPI Group |
2 | Name of the classification KPI Group |
3 | Current Z-Score of the classification KPI Group |
4 | Name of the BMC AMI Ops view that displays the details about the KPI Group. Click the view name to open the view. Important: If the link to the view is disabled, make sure that the CASID is defined in the data preparation address space, that ENABLE_RCA=true is set in amipdt.properties, and that the target (such as Db2 or LPAR) is monitored by BMC AMI Ops Monitor products. |
5 | Checkbox for selecting a single classification path. If a classification KPI Group includes more than one path on the graphical representation, you can select more than one to see how they affect one another. |
6 | Click View Detailed Analysis to view the detailed analysis for the classification. For more information on detailed analysis, see Viewing-detailed-analysis. |
Limit Breach indicators
Limit breaches appear below the graphical representation at the time they occur. Hover over a limit breach indicator to see the nature of the limit breach and the value.
Exceptions
Exceptions appear below the graphical representation at the time they occur. Hover over an Exception indicator to see the nature of the Exception.
Exceptions are categorized as Db2 Exceptions or z/OS Exceptions.
WLM grouping
The WLM grouping is displayed as an azure circle that encloses the nodes that are grouped together.
Change in classification
Any change in the classification while an event is running is indicated by a marker on the Event Timeline.
Click below the graphical representation to see the details of the anomalies detected:
Item | Description |
---|---|
KPI Group | KPI Groups that are in anomaly state at the selected time |
Indicators | |
Category | Categories of the KPI Groups |
Anomaly | Type of anomaly. If the anomaly is a limit breach, it is indicated here. |
BMC AMI Ops Views | BMC AMI Ops views that display more details. Click the view name to open the view. |
Latest Ingestion
The Latest Ingestion displays the latest time and date at which the data was processed. If the Latest Ingestion value does not change, data might not be getting collected and processed.
For more information, see Viewing data in the footer bar.
Event End
The Event End displays when an event ended. It is displayed when you are looking at historical events. If you are in Active Events, the latest ingestion value is displayed.
To view an event progression
The Event Progression tab shows how the event developed over time. It is divided into two sections:
- Event Log—Sequence of KPI Groups going into anomaly and Exceptions occurring
- Timeline and Graphs—Graphic display of the performance of the KPI Groups and Exceptions over time or individual graphs per KPI Group for all KPI Groups that are in anomaly, with a breakdown to the individual KPIs
To filter the Event Progression
- Click Filter.
- Select the Categories and KPI Groups that you want to see.
If you select or deselect a Category, all of the KPI Groups in the Category are selected or deselected. If you select or deselect a KPI Group, the relevant Category is selected but the other KPI Groups in the Category are not selected.
- Click Apply.
To see the Event Log
The Event Log shows a sequence of KPI Groups going into and out of anomaly state, and exceptions occurring.
Item | Description |
---|---|
1 | KPI Groups going into anomaly state. KPI Groups are listed here when they go into anomaly state. After a KPI Group is listed, it is not listed again if its state does not change. KPI Groups that stabilize and go back to anomaly state are relisted. Exceptions are listed every time they occur and are indicated by a red dot. |
2 | Time of the anomalies. Click the time to see the time reflected on the timeline, in the graphs, and in the log. The selected time is indicated on the Event Log by a box. |
3 | KPI Groups stabilizing. If the KPI Groups have stabilized after being in an anomaly state, they are listed here. |
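
The listing rule for the Event Log can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the product's logic: the function `build_event_log`, the sample format, and the KPI Group name are hypothetical. A KPI Group is logged only when its state changes, so a group that remains in anomaly is not listed again, while a group that stabilizes and later returns to anomaly is relisted. Exceptions, which are listed every time they occur, are not modeled here.

```python
from typing import Dict, Iterable, List, Tuple

# (time, KPI Group name, in_anomaly) - a hypothetical sample format
Sample = Tuple[str, str, bool]


def build_event_log(samples: Iterable[Sample]) -> List[Tuple[str, str, str]]:
    """Return (time, KPI Group, transition) entries, logging state changes only."""
    last_state: Dict[str, bool] = {}  # last known anomaly state per KPI Group
    log: List[Tuple[str, str, str]] = []
    for time, group, in_anomaly in samples:
        previous = last_state.get(group, False)
        if in_anomaly and not previous:
            log.append((time, group, "anomaly"))
        elif previous and not in_anomaly:
            log.append((time, group, "stabilized"))
        last_state[group] = in_anomaly
    return log


samples = [
    ("10:00", "CPU", True),   # enters anomaly: listed
    ("10:05", "CPU", True),   # no change: not listed again
    ("10:10", "CPU", False),  # stabilizes: listed as stabilizing
    ("10:15", "CPU", True),   # returns to anomaly: relisted
]
for entry in build_event_log(samples):
    print(entry)
```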
To see the timeline
The timeline shows the performance of each KPI Group over time during the event.
Item | Description |
---|---|
1 | Timeline filter. By default, only KPI Groups that are in anomaly state are displayed. Click the filter and select Anomaly & Stable state to display KPI Groups that were in an anomaly state during this event, but are stable at the selected time. |
2 | Names of the KPI Groups and Exceptions |
3 | Indicates the current trend for each KPI Group |
4 | Timeline showing performance for the KPI Groups and Exception KPIs. Important: Click Legend for details. |
5 | Slider to see values at a specific time on the timeline. You can also use the < arrow to view what happened earlier in the event. Use the > arrow to move to a later hour. |
To see the graphs
Important
This functionality is available only if you have the Docker container installed. For more information, see Installing-BMC-AMI-Ops-Insight.
- To see graphs for all KPI Groups that are currently in anomaly, click Switch to Graph View.
- To view the graphs in full screen mode, or to return to normal mode, click the corresponding icon.
Graph view displays individual graphs per KPI Group for all KPI Groups that are in anomaly state at the selected time, with a breakdown to the individual KPIs. If you select a different time in the Event Log, the graphs change according to the KPI Groups that were in anomaly state at that time.
By default, the graphs show the KPI values. To switch to Z-Scores, select the KPI Z-Score option. To switch back, select KPI Value.
- To see KPI values on a graph for a specific time, hover over the graph.
- To see a graph for a single KPI, click the KPI name below the graph.
- To see a graph for multiple KPIs within a KPI Group, press Ctrl or Shift to select multiple KPIs.
- To zoom in on a smaller time range, drag your cursor across one of the graphs to select the range. All of the graphs zoom in on the time range you selected.
- To see an expanded view of a single graph, click the list next to the graph title and select View.
A more detailed view of the graph is displayed.
- To return to the view of all of the KPI Group graphs, click the back arrow or press Esc.
For z/OS events, the graphs include a filter for WLM importance levels.
By default, all WLM levels are shown. Click the All button to open the filter and select individual WLM levels.
To return to the previous page
To return to the previous page from the Timeline page, click the Back to pageName link at the top left of the current page.
Important
The Back to pageName link is not available on drill-down pages.