Configuring Self-Health Monitoring for data and events out of synchronization using Cell Monitoring KM



In an HA environment, the data and events on the Primary and Secondary nodes must stay in synchronization. Occasionally, the data and events can fall out of synchronization in your Remote Cell HA or Infrastructure Management HA environment. Administrators can deploy this Cell Monitoring (Custom) KM and configure self-health monitoring policies to identify and handle this problem by using the TrueSight console or a Remote PATROL Agent configured on a TrueSight Infrastructure Management

Administrators can configure the monitoring policies so that Monitoring Solutions (KMs) generate alarm events when the data or events are out of synchronization in Cell HA or Infrastructure Management HA. You can then monitor these alarm events on the TrueSight console > Monitoring > Devices page.

Perform the following steps in the order listed to install and configure the monitoring policies for data and events out of synchronization:

Prerequisites

  • Download the Custom KM from the ftp://ftp.bmc.com/pub/PATROL_KM/CellMonitor location. You can download the Unix or Windows build based on the PATROL Agent in your environment.
  • The PATROL Agent must be integrated with the Infrastructure Management server and it must be in connected state on the TrueSight console > Configuration > Managed Devices page. For more information, see Managing Infrastructure Management devices from the TrueSight console.

  • You must have deployed the Custom KM on PATROL Agent. For more information, see Importing the Infrastructure Management repository on the Presentation Server or deploying a custom KM on PATROL Agent.

  • Both Primary and Secondary nodes of the Remote Cell HA and Infrastructure Management HA must be up and running for the out-of-synchronization data and event queries to run.
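
Before you configure the policy, it can help to confirm that both nodes accept connections. The following is a minimal sketch, not a BMC tool: the function names are hypothetical, and it assumes the default cell port 1828 mentioned later in this topic. A successful TCP connection shows only that something is listening on the port, not that the cell is healthy or in synchronization.

```python
import socket
from typing import Dict, Iterable, Tuple

DEFAULT_CELL_PORT = 1828  # default cell port stated in this topic


def is_port_open(host: str, port: int = DEFAULT_CELL_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_nodes(nodes: Iterable[Tuple[str, int]], timeout: float = 3.0) -> Dict[str, bool]:
    """Check each (host, port) pair and return a 'host:port' -> reachable mapping."""
    return {f"{host}:{port}": is_port_open(host, port, timeout) for host, port in nodes}
```

For example, `check_nodes([("primary-cell.example.com", 1828), ("secondary-cell.example.com", 1828)])` returns one entry per node; the host names here are placeholders.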

To configure the self-health monitoring policies for data and event out of synchronization

For this Custom KM, the Agent Thresholds are set and enabled by default. However, you can also configure the Server Thresholds if required. The default Agent Thresholds values are:

  • For Data out of sync: Threshold 1 is set to Min = 1 and Max = 100 
  • For Events out of sync: Threshold 1 is set to Min = 1 and Max = 100 
  • For Status: Threshold 1 is Warning and Threshold 2 is Alarm
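
To make the default ranges concrete, the following sketch models Threshold 1 for the two out-of-sync attributes. It is illustrative only: the class and function names are hypothetical, and it assumes that the Min and Max bounds are inclusive and that a value inside the range breaches the threshold. The Status thresholds are omitted because this topic does not state their ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRange:
    minimum: int
    maximum: int

    def contains(self, value: int) -> bool:
        # Assumption: both bounds are inclusive.
        return self.minimum <= value <= self.maximum


# Default Threshold 1 ranges listed above (Min = 1, Max = 100).
DEFAULT_AGENT_THRESHOLDS = {
    "Data out of sync": ThresholdRange(1, 100),
    "Events out of sync": ThresholdRange(1, 100),
}


def breaches_threshold_1(attribute: str, value: int) -> bool:
    """Return True if the value falls inside the default Threshold 1 range."""
    return DEFAULT_AGENT_THRESHOLDS[attribute].contains(value)
```

Under these assumptions, a count of 0 does not breach the threshold, while any count from 1 through 100 does. A count above 100 falls outside the default range, which may be a reason to widen the range or add a Server Threshold.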


Perform the following steps to configure the monitoring policies for generating alarm events:

  1. Create (or Edit) a monitoring policy from TrueSight console > Configuration > Infrastructure Policies. For more information, see Defining a monitoring policy.

  2. Select the Monitoring tab and click Add Monitoring Configuration.
  3. In the Monitoring Solutions field, select Cell HA Monitor.
  4. In the Cell High-availability Configuration option, click Add.
  5. Specify the cell HA configuration details as mentioned in the following table:

    Cell High-availability Configuration

    Field | Description
    Cell HA Group Label/Monitor Instance Name | Specify the label of the monitoring instance for the Cell HA group. Note: Use the same value when you configure the Server Thresholds > Monitor Instance Name.
    Primary Cell Host Name | Specify the Primary Cell host name. It can be an FQDN, short name, or IP address.
    Primary Cell Port | Specify the port number for the Primary Cell. The default value is 1828.
    Secondary Cell Host Name | Specify the Secondary Cell host name. It can be an FQDN, short name, or IP address.
    Secondary Cell Port | Specify the port number for the Secondary Cell. The default value is 1828.
    Cell Encryption Key | Specify the encryption key. The default value is mc.

  6. Click OK and Close to exit the dialog box. The configuration is added as a new row in the Monitoring tab. 
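
The fields in the Cell High-availability Configuration dialog box can be summarized as a small data structure. The following is a sketch for planning or reviewing values before you enter them in the console. It is not a TrueSight API: the class name, attribute names, and validation rules are hypothetical, and only the defaults (port 1828, encryption key mc) come from this topic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CellHAConfig:
    group_label: str            # Cell HA Group Label/Monitor Instance Name
    primary_host: str           # FQDN, short name, or IP address
    secondary_host: str         # FQDN, short name, or IP address
    primary_port: int = 1828    # documented default
    secondary_port: int = 1828  # documented default
    encryption_key: str = "mc"  # documented default

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means no problem was found."""
        problems = []
        if not self.group_label.strip():
            problems.append("group label is empty")
        for name, host in (("primary", self.primary_host), ("secondary", self.secondary_host)):
            if not host.strip():
                problems.append(f"{name} host name is empty")
        for name, port in (("primary", self.primary_port), ("secondary", self.secondary_port)):
            if not 1 <= port <= 65535:
                problems.append(f"{name} port {port} is outside 1-65535")
        return problems

    def matches_server_threshold(self, monitor_instance_name: str) -> bool:
        """The Server Threshold Monitor Instance Name must equal the group label."""
        return self.group_label == monitor_instance_name
```

The `matches_server_threshold` check mirrors the note in the table: the label entered here is the value to reuse for the Monitor Instance Name when you configure the Server Thresholds.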
  1. Select the Polling Interval tab and click Add Polling Interval.
  2. In the Monitoring Solutions field, select Cell HA Monitor.
  3. (Optional) In the Monitoring Parameter field, select Cell Data Sync Collector (selected by default).
    By default, the Polling Interval is 15 minutes.
  4. Specify the Parameter Polling Interval and click OK to stay on the dialog box.
  5. (Optional) In the Monitoring Parameter field, select Cell Event Sync Collector.
    By default, the Polling Interval is 15 minutes.
  6. Specify the Parameter Polling Interval and click OK to stay on the dialog box.
  7. (Optional) In the Monitoring Parameter field, select Status.
    By default, the Polling Interval is 1 minute.
  8. Click OK and Close to exit the dialog box. The polling interval is updated in the Cell HA configuration.

Note

Based on your requirements, you can configure:

  • Only Agent Thresholds
  • Only Server Thresholds
  • Both
  1. Click the Agent Threshold tab.
  2. Click Add Agent Threshold.
  3. In the Monitoring Solutions field, select Cell HA Monitor.
  4. In the Monitoring Attribute field, select Data out of sync (selected by default).
  5. (Optional) Set the Threshold Ranges by selecting the Enable checkbox.
    By default, Threshold 1 is enabled. For more information, see Defining a monitoring policy.

  6. Click OK to stay on the dialog box.
  7. In the Monitoring Attribute field, select Events out of sync.
  8. (Optional) Set the Threshold Ranges by selecting the Enable checkbox.
    By default, Threshold 1 is enabled. For more information, see Defining a monitoring policy.

  9. Click OK to stay on the dialog box.
  10. In the Monitoring Attribute field, select Status.
  11. (Optional) Set the Threshold Ranges by selecting the Enable checkbox.
    By default, both Threshold 1 and Threshold 2 are enabled. For more information, see Defining a monitoring policy.

  12. Click OK and Close to exit the dialog box. The monitor type, with the selected attribute and threshold options for the given instance, is added to or updated in the table on the Agent Threshold tab.

Note

Based on your requirements, you can configure:

  • Only Agent Thresholds
  • Only Server Thresholds
  • Both

  1. Click the Server Threshold tab.
  2. Click Add Server Threshold.
  3. In the Monitoring Solutions field, select Cell HA Monitor.
  4. Specify a unique and meaningful Monitor Instance Name.
    Note: Monitor Instance Name must be the same value that you specified in the Cell High-availability Configuration >
  5. (Optional) Select Associate with a Device.
  6. (Optional) In the Monitoring Attribute field, select Data out of sync (selected by default).
  7. Specify the Threshold values. For more information, see Defining a monitoring policy.

  8. Click OK to stay on the dialog box.
  9. (Optional) In the Monitoring Attribute field, select Events out of sync.
  10. Specify the Threshold values. For more information, see Defining a monitoring policy.

  11. Click OK to stay on the dialog box.
  12. (Optional) In the Monitoring Attribute field, select Status.
  13. Specify the Threshold values. For more information, see Defining a monitoring policy.

  14. Click OK and Close to exit the dialog box. The monitor type, with the selected attribute and threshold options for the given instance, is added to or updated in the table on the Server Threshold tab.

Monitoring the data and event out of synchronization

After configuring the policy, you can verify from the TrueSight console that the monitor instance has been created. The instance name has the format <monitor_instance_name>\Cell Monitor, for example, my_monitor_instance.abc.com\Cell Monitor.

You can check the newly created monitor instance from the TrueSight console > Monitoring > Devices > Device Details > Monitors tab and view the event and data out of sync for the newly created monitor instance. 
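
If you script a check for the expected instance, the name format described above can be built and parsed as follows. This is a small sketch with hypothetical function names; it relies only on the <monitor_instance_name>\Cell Monitor format given in this topic.

```python
MONITOR_SUFFIX = "\\Cell Monitor"  # literal backslash followed by "Cell Monitor"


def monitor_display_name(monitor_instance_name: str) -> str:
    """Build the name shown in the console for a given monitor instance name."""
    return monitor_instance_name + MONITOR_SUFFIX


def monitor_instance_from_display_name(display_name: str) -> str:
    """Return the monitor instance name, or raise ValueError if the format differs."""
    if not display_name.endswith(MONITOR_SUFFIX):
        raise ValueError(f"unexpected monitor name: {display_name!r}")
    return display_name[: -len(MONITOR_SUFFIX)]
```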

Data and event out-of-sync metrics

This section applies only if you have deployed this Custom KM to TrueSight Operations Management version 11.3.01 or 11.3.02. On the Monitor Details > Performance Overview tab, the following metrics are available for selection for both Multimetric Comparison and Single Metric Analysis:

KPI: Data out of sync(#), KPI: Event out of sync(#), and Status (0-OK, 1-Busy, 2-Connection Error)
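
When you read the Status metric outside the console, for example in an exported report, the numeric values can be mapped back to their labels. The following sketch uses hypothetical names; only the three value-to-label pairs come from the list above.

```python
from enum import IntEnum


class CellStatus(IntEnum):
    OK = 0
    BUSY = 1
    CONNECTION_ERROR = 2


def describe_status(value: int) -> str:
    """Return the label for a Status value, or UNKNOWN(n) for undocumented values."""
    try:
        return CellStatus(value).name
    except ValueError:
        return f"UNKNOWN({value})"
```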

For more information about monitoring out-of-synchronization data and events, see Viewing device details and Viewing monitor information in the TrueSight console.

Manually resolving the data and event out of synchronization

You can manually resolve or troubleshoot the out of synchronization data and events. For more information, see Cell database out of sync.


 
