Architecture


PATROL for Agent Failover empowers BMC Helix Operations Management customers by allowing them to assign data collection responsibilities to a pair of PATROL Agents. If the primary PATROL Agent (Primary) becomes unreachable, the partnered PATROL Agent (Secondary) is automatically configured to connect to BMC Helix Operations Management and resume the collection duties of the failed agent. Upon the Primary Agent regaining connectivity, it resumes its duties, and the Secondary Agent disconnects from BMC Helix Operations Management.

This Knowledge Module (KM) is designed to run on a PATROL Agent outside of the identified failover pairs. This setup enables a single PATROL for Agent Failover agent to support any number of PATROL Agent pairs.

This Monitoring Agent can be paired with another agent, providing monitor-the-monitor redundancy if required.

A typical architecture for PATROL for Agent Failover is as follows:

[Figure: HA_Architecture.png]

How PATROL for Agent Failover works

Step 1:

To begin, configure the host for the PATROL for Agent Failover knowledge module (KM) as a standard Agent connected to BMC Helix Operations Management. Then deploy the PATROL for Agent Failover KM to this Agent.

Step 2:

Both the Primary and Secondary PATROL Agents (HA pair) must have proper connectivity with BMC Helix Operations Management. Configure a virtual host name on both the Primary and Secondary PATROL Agents, so that BMC Helix Operations Management displays only a single Agent entry with the virtual name.

Next, install and configure the PATROL Agents intended to function as a failover pair. Ensure that both instances are configured to use an environment variable to control their "hostname".

  • On Windows, set the environment variable in the "System" area.
  • On Linux/UNIX systems, create a "userrc.sh" file in the "Patrol3" directory to set the environment variable. The out-of-the-box PatrolAgent startup searches for the "userrc.sh" file and reads its content into the PATROL Agent environment. Multiple Agents can be registered in the file.

Example:

PATROL_VIRTUALNAME_3182=PingAgent_Houston
export PATROL_VIRTUALNAME_3182
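
Because several Agents can be registered in one file, a slightly fuller sketch may help. The following is only an illustration, not a BMC-supplied script: it assumes the numeric suffix of the variable identifies the Agent (for example, by port), uses a scratch directory in place of the real "Patrol3" directory, and the second entry (3181, DiskAgent_Houston) is a hypothetical name added for the example. Note that a POSIX shell assignment must not have a space after the "=" sign.

```shell
# Sketch only: write a userrc.sh that registers two Agents, then source it.
# PATROL_HOME_DEMO is a scratch directory standing in for the Patrol3 directory.
PATROL_HOME_DEMO=$(mktemp -d)

cat > "$PATROL_HOME_DEMO/userrc.sh" <<'EOF'
# One variable per Agent. No space is allowed after "=".
PATROL_VIRTUALNAME_3182=PingAgent_Houston
export PATROL_VIRTUALNAME_3182
# Hypothetical second Agent, for illustration only.
PATROL_VIRTUALNAME_3181=DiskAgent_Houston
export PATROL_VIRTUALNAME_3181
EOF

# Read the file into the current shell and print the resulting values.
. "$PATROL_HOME_DEMO/userrc.sh"
echo "3182 -> $PATROL_VIRTUALNAME_3182"
echo "3181 -> $PATROL_VIRTUALNAME_3181"
```

Sourcing the file in a test shell like this is a quick way to catch syntax mistakes before the file is placed on both hosts of the pair.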
Note

This solution does not work as intended with the "publishHost" variable assignment. Both Agents must use the same Agent name, set in the environment variable described above.

Ensure that the PATROL Agents have the required Knowledge Modules installed to support the desired data collection (for example, the LWP KM if it is used to collect "ping" data).

Initially, the pair of PATROL Agents should NOT be configured to use any existing Policy. The "Primary" Agent starts out connected to the BMC Helix Operations Management infrastructure, while the "Secondary" Agent remains disconnected from it. The Failover Agent manages when and how the Secondary is connected. To ensure this behavior, the servers on which the collecting Agent pair is installed should have a NULL value in the integrationservice.current file, so that the two Agents do not connect to the BMC Helix Operations Management infrastructure simultaneously, should they be purged.

Note

The Agent pair must be configured to monitor those Knowledge Modules that are shared and identified for monitoring by the Failover monitoring Agent.

Before proceeding to step 3:

  • Confirm that the Primary Agent is connected to BMC Helix Operations Management infrastructure and is visible in Manage Devices using the Alias Name.
  • Ensure the Secondary Agent is running but NOT connected to BMC Helix Operations Management infrastructure.

Step 3:

Configure the PATROL for Agent Failover Agent using Policy configuration in the BMC Helix Operations Management infrastructure UI. Configure the Policy in a manner similar to the description provided on the following pages:

Note

  • Assign a "high" precedence to the Policy.
  • Implement highly Agent-specific "Agent Selection Criteria."

 

 


BMC PATROL for Agent High Availability 23.4