Disaster recovery for Application Servers
For disaster recovery, you should maintain an Application Server and its associated TrueSight Server Automation infrastructure (the database and file server) at a secondary site. The secondary site must be distant enough from the primary site to ensure continued operation if the primary site is completely inoperable due to the loss of the physical infrastructure or the services that enable that infrastructure, such as power, the network, and so on.
In normal operation, the Application Server at the secondary site is not running. It is started only when the primary site is inoperable or is taken down for scheduled work. In such cases, the standby TrueSight Server Automation database at the secondary site is activated first, and then the standby Application Server services are started.
To ensure that all clients that are configured to use the Application Server at the primary site are serviced by the secondary site, you must make one of the following configuration updates:
- Move the virtual IP address (VIP) that connects to the primary site's Application Server to the secondary site's Application Server.
- Update the domain name service (DNS) VIP entry to point to the Application Server at the secondary site.
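After either update, it can help to confirm which site the client-facing name actually resolves to. The following is a minimal sketch only: the IP addresses and the `resolve` parameter are illustrative assumptions, not values or interfaces taken from TrueSight Server Automation.

```python
import socket

# Hypothetical VIP addresses for the Application Server at each site.
SITE_ADDRESSES = {
    "primary": "192.0.2.10",
    "secondary": "198.51.100.10",
}


def site_for_address(address, site_addresses=SITE_ADDRESSES):
    """Return the site name whose address matches, or None if unknown."""
    for site, site_address in site_addresses.items():
        if site_address == address:
            return site
    return None


def current_site(hostname, resolve=socket.gethostbyname):
    """Resolve the client-facing name and report which site it points to.

    `resolve` defaults to a real DNS lookup; pass a stub to test offline.
    """
    return site_for_address(resolve(hostname))
```

Calling `current_site` with the name that clients use (for example, a hypothetical `appserver.example.com`) returns `"secondary"` once the VIP or DNS change has taken effect, assuming the addresses in `SITE_ADDRESSES` match your environment.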
The benefits derived from maintaining a secondary site for disaster recovery are as follows:
- A loss of the primary site does not result in extended downtime of Application Server services.
- For scheduled work at the primary site that would take down the primary Application Server or its supporting infrastructure (database or file server), you can temporarily fail over to the secondary site and keep Application Server services available.
The following figure shows an easily recoverable architecture for Application Servers:
Implementation
To implement an easily recoverable Application Server architecture, perform the following actions:
- To ensure connection consistency across all Application Servers, construct an alias for the JDBC connection string and use the alias for all Application Servers. The alias should be easily changeable to point to a different database. For example, you should be able to change from the PROD database to the DR (disaster recovery) database by using a product such as 3DNS from F5 Networks, changing an entry in the /etc/hosts file, or performing a similar action.
- Similarly, construct an alias for the file server path. The alias should be easily changeable to point to the appropriate location.
- At the secondary site, install the Application Server.
- Fail over to the secondary site:
  - Disable database and file server replication.
  - Change the aliases to the database and file server to point to the secondary site.
  - Start the standby database at the secondary site.
  - Start the Application Servers at the secondary site and ensure that they start properly.
  - Verify Application Server functionality.
- Return to normal operation using the primary site:
  - Shut down the Application Server at the secondary site.
  - Synchronize the database and file server from the secondary site to the primary site.
  - Change the aliases to point to the database and file server at the primary site.
  - Start up services at the primary site.
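The alias approach described in the first two actions above can be illustrated with a hosts-file entry. The following is a minimal sketch, assuming a hypothetical alias named `tssa-db` that the JDBC connection string would reference, and illustrative IP addresses. It operates on hosts-file text only; writing the result back to /etc/hosts is left to your change procedure.

```python
def repoint_alias(hosts_text, alias, new_ip):
    """Return hosts-file text in which `alias` maps only to `new_ip`.

    Entries for other names, comment lines, and blank lines are kept.
    If the alias shared a line with other names, those names stay on the
    old address (any inline comment on that line is dropped).
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore anything after '#', then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and alias in fields[1:]:
            others = [name for name in fields[1:] if name != alias]
            if others:
                kept.append(fields[0] + "\t" + " ".join(others))
            continue
        kept.append(line)
    # Add a single entry that points the alias at the new address.
    kept.append(new_ip + "\t" + alias)
    return "\n".join(kept) + "\n"
```

Because every Application Server refers to the alias rather than to a specific host, switching between the PROD and DR databases becomes a one-line change in each hosts file (or a single DNS update), with no change to the connection string itself.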
Keep the secondary site's Application Server down until a failover to the secondary site is required. In that case, activate the standby database first, and then start the Application Server.
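The ordering requirement above (database first, then Application Server) can be enforced by a small runner that stops at the first failed step. This is a sketch under stated assumptions: the step names mirror the actions in this topic, but the placeholder actions are hypothetical and must be replaced with your own site-specific procedures.

```python
def run_in_order(steps):
    """Run (name, action) pairs in order and stop at the first failure.

    Each action is a callable that returns True on success. The return
    value is the list of step names that completed successfully.
    """
    completed = []
    for name, action in steps:
        if not action():
            break  # Do not attempt later steps, such as starting the server.
        completed.append(name)
    return completed


# Hypothetical placeholders: each lambda stands in for a real procedure.
FAILOVER_STEPS = [
    ("disable replication", lambda: True),
    ("repoint database and file server aliases", lambda: True),
    ("activate standby database", lambda: True),
    ("start Application Servers", lambda: True),
    ("verify Application Server functionality", lambda: True),
]
```

With this structure, a failure to activate the standby database prevents the Application Server from being started against a database that is not ready.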