Infrastructure Management Server high-availability architecture
A high-availability deployment of Infrastructure Management consists of two servers with identical configuration, one designated as primary and the other as secondary. At any point, one node is active and the other is in standby mode. If the active node shuts down or otherwise becomes unavailable, the standby node takes over the active role. When you restore the primary server, failback occurs, and the primary server becomes the active node again.
The Infrastructure Management HA components automatically manage synchronization between the active and standby nodes and automatically detect a failover situation. You need to configure a third-party load balancer in reverse proxy mode between the TrueSight Presentation Server and the HA-enabled Infrastructure Management server.
An Infrastructure Management HA deployment comprises the following systems:
- Primary server
- Secondary server
- Third-party load balancer
HA deployment for Infrastructure Management
In an HA deployment of Infrastructure Management, the load balancer is installed on a separate server and redirects requests to the active node. The load balancer provides a single point of access to the HA-enabled Infrastructure Management server.
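The routing rule described here can be illustrated with a minimal sketch. This is not the configuration of any particular load balancer product; the `Node` type and the `pick_backend` function are hypothetical names introduced only to show the idea of sending every request to whichever node currently holds the active role.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    name: str     # hypothetical label, for example "primary" or "secondary"
    active: bool  # True if this node currently holds the active role


def pick_backend(nodes: Sequence[Node]) -> Optional[Node]:
    """Return the node that should receive requests, or None if no single
    active node can be identified (for example, while a failover is in progress)."""
    active_nodes = [n for n in nodes if n.active]
    # Exactly one node is expected to be active at any point.
    return active_nodes[0] if len(active_nodes) == 1 else None
```

For example, `pick_backend([Node("primary", False), Node("secondary", True)])` returns the secondary node, which is the situation after a failover.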
Failover and failback
Failover occurs when the primary server becomes unavailable or when any of the critical server processes becomes unavailable. The failover from the active node to the standby node can take up to 6 minutes.
There are two types of failback: automatic and manual. In automatic failback, which is the default behavior, the primary server becomes the active node upon server startup. For manual failback, perform the following steps:
- (Primary server only) Set the following property to false in the installedDirectory\pw\custom\conf\ha.conf file:
- (Windows only) Change the startup type of the BMC TrueSight Infrastructure Management server service from Automatic to Manual.
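The difference between automatic and manual failback can be sketched as a small state model. This is an illustration only, not the product's implementation: the `HaPair` class, its field names, and the `manual_failback` method are hypothetical, and the sketch assumes that with manual failback the secondary server stays active until an administrator explicitly hands the role back.

```python
from dataclasses import dataclass


@dataclass
class HaPair:
    auto_failback: bool = True  # automatic failback is the documented default
    primary_up: bool = True
    active: str = "primary"     # which node currently holds the active role

    def primary_fails(self) -> None:
        """Failover: the standby (secondary) node takes over the active role."""
        self.primary_up = False
        self.active = "secondary"

    def primary_restored(self) -> None:
        """The primary server starts up again after an outage."""
        self.primary_up = True
        if self.auto_failback:
            # Automatic failback: the primary becomes active on startup.
            self.active = "primary"
        # Assumption: with manual failback the roles stay unchanged here.

    def manual_failback(self) -> None:
        """Administrator-initiated failback (hypothetical operation name)."""
        if self.primary_up:
            self.active = "primary"
```

With the default settings, `primary_fails()` followed by `primary_restored()` leaves the primary active again. With `auto_failback=False`, the secondary remains active until `manual_failback()` is called.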
During failover and failback, the Infrastructure Management server might be disconnected from the Presentation Server until the standby node becomes active. During this time, you might not be able to perform any operations on the Infrastructure Management server.
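Scripts or integrations that depend on the Infrastructure Management server may need to tolerate this disconnection window. The following is a minimal sketch of a wait loop, assuming a caller-supplied availability check; `wait_until_available`, the `is_available` callback, and the 10-second polling interval are hypothetical, and only the 6-minute upper bound comes from the failover time stated in this topic.

```python
import time
from typing import Callable

FAILOVER_BUDGET_S = 6 * 60  # failover can take up to 6 minutes


def wait_until_available(
    is_available: Callable[[], bool],
    timeout_s: float = FAILOVER_BUDGET_S,
    interval_s: float = 10.0,  # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_available() until it returns True or timeout_s elapses.

    Returns True if the server became reachable in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if is_available():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays; in normal use the defaults apply and the caller passes a function that probes the load balancer address.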
The following diagrams illustrate the failover and failback (automatic and manual) behavior.
Failover and automatic failback
Failover and manual failback