High availability

This topic provides recommendations for both planning and deployment of a highly available BMC Database Automation Manager. The solution presented is general and not tied to a specific clustering technology or product.

Overview of solution

Figure: HA Overview (BMC Database Automation)

The goal of the configuration illustrated above is to maximize service availability when an isolated failure (for example, software, OS, physical hardware, or virtual guest) occurs on one of the two available nodes. This configuration does not provide storage redundancy, because that is expected to have been architected at the SAN level. Consequently, an outage that affects the entire datacenter prevents this configuration from continuing to service user requests. See Disaster-recovery if you need guidance for that scenario.

The shared storage (shown previously) should be mounted only on the active node. Nearly all dynamic and application-related data is stored there. The specific configuration files described in the Prerequisites to installation section are not shared and, therefore, must be kept in sync on each node independently.

BMC Database Automation's management interface is accessed using a virtual IP address (usually configured via vendor clusterware), but end users usually connect using the host name that corresponds to it. Existing sessions to the BMC Database Automation Manager are lost during failover, and users must refresh or re-authenticate when that happens. A complete failover or switchover is expected to take no more than five minutes. The actual duration can vary with the components of the solution stack, such as the clusterware, OS, and system specification.

BMC Database Automation components

The following components are related to high availability in BMC Database Automation.

postgresql

The postgresql component is the underlying transactional database required for installation of the core BMC Database Automation solution. The postgresql schema hosts data pertaining to job history, patch metadata, node information, and user accounts, among other things. It is therefore essential that this component be made highly available for a successful switchover or failover. Because the data is dynamic, it must be located on the shared storage component shown in the diagram above. By default, RHEL places the data and transaction log files on the /var partition, which is generally not feasible when the storage must be shared as it is in this architecture. The Prerequisites to installation section describes how to relocate the postgres database to the administrator's desired location.

Install postgresql on both nodes. Review the documentation for your cluster solution to determine whether the services should be configured for auto-start.

dmanager

This is the product's network daemon which provides persistent connectivity to all Agents pointed to this particular Manager (using the Agent configuration file). It is also used for encrypted transmission of all workflows built in the middle tier (see below for description). It is critical that this service be available for any operational activities to take place, and it can be monitored in several ways:

PID file availability - /var/run/dmanager.pid
Port monitoring - TCP port 7003
System process table (dmanager)
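
The first two checks in the list above can be sketched in Python as follows. This is a minimal illustration, not a BMC-supplied monitor: the function names are hypothetical, a Python 3 interpreter is assumed to be available on the node, and the host that dmanager listens on (127.0.0.1 here) is an assumption that you may need to replace with the virtual IP address.

```python
import os
import socket

PID_FILE = "/var/run/dmanager.pid"  # path taken from the list above
PORT = 7003                         # TCP port taken from the list above


def pid_file_alive(pid_file=PID_FILE):
    """Return True if the PID file exists and names a running process."""
    try:
        with open(pid_file) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        return False  # file missing, unreadable, or not a number
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 tests for existence without signalling
    except ProcessLookupError:
        return False     # stale PID file
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def port_open(host="127.0.0.1", port=PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (host is an assumption)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def dmanager_healthy():
    """Hypothetical combined check: PID file is live and the port accepts connections."""
    return pid_file_alive() and port_open()
```

A cluster resource agent could call dmanager_healthy() and map the result to whatever exit status your clusterware expects.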

The dmanager should be installed only on the active node. During a product upgrade, it must be located on the original active node (Node A) to ensure that the correct RPM database is used. Review the documentation for your cluster solution to determine whether the services should be configured for auto-start.

mtd

The mtd, also known as the Middle Tier Daemon, is the business logic tier of the BMC Database Automation product. It builds workflows into a form that the target system's Agent (dagent) can understand. It can be monitored in several ways, including the following:

PID file availability - /var/run/mtd.pid
System process table (mtd)
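
The process table check can be sketched by scanning /proc on Linux. This is an illustrative example only: the function name is hypothetical, a Python 3 interpreter is assumed to be available on the node, and the process is assumed to appear in the table under the name mtd.

```python
import os


def process_running(name, proc_root="/proc"):
    """Return True if any process in proc_root has the given command name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                data = fh.read()
        except OSError:
            continue  # process exited while we were scanning
        # The stat line has the form "pid (comm) state ..."; the command
        # name sits between the first "(" and the last ")".
        start, end = data.find("("), data.rfind(")")
        if start != -1 and end > start and data[start + 1:end] == name:
            return True
    return False
```

For example, process_running("mtd") reports whether the Middle Tier Daemon is present in the process table on the node where the check runs.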

The Middle Tier Daemon should be installed only on the active node. During a product upgrade, it must be located on the original active node (Node A) to ensure that the correct RPM database is used. Review the documentation for your cluster solution to determine whether the services should be configured for auto-start.

httpd

The vendor-provided (RHEL) Apache packages are known as httpd. Install these packages on both nodes. The default BMC Database Automation product installer creates some httpd configuration files and places Apache modules in the default location, /etc/httpd/modules. See the Prerequisites to installation section for details on which files are needed and where they should be placed. Note: Copying these files is required only for the passive node, because they are created on the primary node when the management software is installed. For additional details about the installation process, see the Installing section.

Patch packages

Patch packages consist largely of vendor-supplied patches and BMC Database Automation-supplied metadata. All patch packages are created using the GUI and are stored in a predetermined location inside the software installation hierarchy. By default, they therefore reside on the shared storage component described previously, and they must remain there for a successful failover or switchover to Node B. BMC Database Automation's Postgres database stores metadata about the on-disk patch packages. For these references to be accurate, the patch content must be the same on both nodes, which is accomplished by sharing the storage. This applies to all supported DBMS platforms:

Sybase
Oracle
MSSQL
DB2

/app/clarity/dmanager/var/pkgs

MSSQL media

If you plan to deploy SQL Server 2000, 2005, or 2008, the installer media must be stored underneath the product installation directory on the Management Server. Unlike the patch packages (above), the MSSQL media is not mapped to any Postgres metadata, but BMC strongly recommends that it be hosted in the default shared storage location.

/app/clarity/var/media

OPatch media

Many Oracle patches require a minimum version of OPatch (the Oracle-provided patch installation tool) to be present. BMC Database Automation can push OPatch from the Manager as part of a patch workflow, which eases the management burden of applying Oracle updates. This media is also located inside the installation directory, and it should remain there so that it is available after a failover or switchover.

/app/clarity/oracle_media

Templates

All out-of-the-box product workflows can be customized using XML templates. These templates are built inside of the BMC Database Automation GUI and are always stored inside the product's installation directory by default. In order to maintain equivalent product functionality on Node B, templates should remain in the default directory so they can reside on shared storage.

/app/clarity/var/templates

Actions

Actions are custom automation that runs independently of the built-in workflows. The same rules that apply to templates also apply to Actions: keep them in the default directory so that they reside on shared storage.

/app/clarity/dmanager/var/actions

Data warehouse

This optional component should be made highly available using Oracle HA technologies (Data Guard or RAC). See the Oracle documentation for additional details.

Resources diagram

Figure: Clustered Resources (BMC Database Automation)

Prerequisites to installation

Complete the following prerequisites before you install components for high availability.

postgresql

Active/Passive Nodes:

After you have planned for deployment, you must configure postgresql to initialize the data directories in the desired location on the dedicated shared storage.

1. Open the /etc/sysconfig/pgsql/postgresql file. Create the file if it does not exist.
2. Add the following line:

PGDATA=/new/pgsql/data/directory

When Postgres is started for the first time it creates all the necessary data and log subdirectories in the new location.
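
If you prefer to script the change on both nodes, the edit can be sketched as follows. This is a minimal illustration under stated assumptions: set_pgdata is a hypothetical helper, a Python 3 interpreter is assumed to be available, and /new/pgsql/data/directory is the same placeholder used above and must be replaced with your shared storage location.

```python
import os


def set_pgdata(sysconfig_path, data_dir):
    """Create or update the sysconfig file so it has exactly one PGDATA line."""
    lines = []
    if os.path.exists(sysconfig_path):
        with open(sysconfig_path) as fh:
            lines = fh.read().splitlines()
    # Drop any earlier PGDATA setting so the new value is the only one.
    kept = [ln for ln in lines if not ln.strip().startswith("PGDATA=")]
    kept.append("PGDATA=" + data_dir)
    parent = os.path.dirname(sysconfig_path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # create the directory if missing
    with open(sysconfig_path, "w") as fh:
        fh.write("\n".join(kept) + "\n")


# Example call (run as root; the data directory is a placeholder):
# set_pgdata("/etc/sysconfig/pgsql/postgresql", "/new/pgsql/data/directory")
```

Running the helper more than once leaves a single PGDATA line in the file, so it is safe to include in a repeatable node preparation script.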

User/group configuration

Passive (Node B) only:

Two steps make the passive node match the active node in this context:

1. Create a 'clarity' user account with the exact same UID and GID used by the active node.
2. Add the 'clarity' user account to the existing 'apache' group.
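
The two steps above might translate into commands such as the ones this sketch builds. It is illustrative only: passive_node_commands is a hypothetical helper, the primary group is assumed to be named 'clarity' as well (the source specifies only the GID), and the UID and GID shown in the comment are placeholders for the values you read from the active node, for example with `id clarity`.

```python
def passive_node_commands(uid, gid, user="clarity", web_group="apache"):
    """Build the commands that recreate the account on the passive node.

    The primary group name is an assumption: it reuses the user name.
    """
    return [
        # Create the primary group with the same GID as on the active node.
        ["groupadd", "-g", str(gid), user],
        # Create the user with the matching UID and primary GID, and add it
        # to the existing web server group as a supplementary group.
        ["useradd", "-u", str(uid), "-g", str(gid), "-G", web_group, user],
    ]


# Example with placeholder IDs; print the commands for review before
# running them as root on Node B.
# for cmd in passive_node_commands(501, 501):
#     print(" ".join(cmd))
```

Building the commands as argument lists keeps them easy to review and to pass to subprocess.run without shell quoting issues.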

httpd

Passive (Node B) only:

There are a few post-configuration steps for Apache after the vendor (RHEL) supplied packages have been installed on the passive node.

1. Copy the following Apache modules from Node A to Node B (source and destination directory should be the same):

File name	Description
/etc/httpd/modules/libphp5.so	Custom distribution of Apache PHP Module.
/etc/httpd/modules/mod_auth_gridapp.so	BMC Database Automation authentication module.

2. Copy the following Apache configuration files from Node A to Node B. Be sure to make backups of any existing configuration files:

File name	Description
/etc/httpd/conf.d/php.conf	Configuration for custom PHP installation.
/etc/httpd/conf.d/gridapp.conf	Additional BMC Database Automation specific configuration stanzas for Apache.
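
After copying, you may want to confirm that the files are identical on both nodes. One way is to compare checksums, as in the following sketch. The helper names are hypothetical and a Python 3 interpreter is assumed; run the script on each node and compare the printed values.

```python
import hashlib

# The four files named in the tables above.
HTTPD_FILES = [
    "/etc/httpd/modules/libphp5.so",
    "/etc/httpd/modules/mod_auth_gridapp.so",
    "/etc/httpd/conf.d/php.conf",
    "/etc/httpd/conf.d/gridapp.conf",
]


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, or None if it cannot be read."""
    digest = hashlib.sha256()
    try:
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(65536), b""):
                digest.update(chunk)
    except OSError:
        return None
    return digest.hexdigest()


def checksum_report(paths=HTTPD_FILES):
    """Map each path to its digest (None marks a missing or unreadable file)."""
    return {path: sha256_of(path) for path in paths}


if __name__ == "__main__":
    for path, digest in checksum_report().items():
        print(digest or "MISSING", path)
```

A file reported as MISSING on Node B, or a digest that differs between the nodes, indicates that the copy step needs to be repeated.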

Supported configurations

The following configurations are supported:

Supported	Unsupported
2-node Active/Passive	Active/Active; Active/Passive with more than 2 nodes
RHEL 4 (x86, x86_64), RHEL 5 (x86, x86_64), RHEL 6 (x86, x86_64)	n/a
Shared binaries (configuration illustrated above)	Local BMC Database Automation binaries. This will most likely work, but is untested.

Testing failover

BMC recommends that the Cluster Administrator issue the commands necessary to perform a 'switchover' activity and a 'switch back' to validate the solution. Validation should include but is not limited to the following:

Validation of the cluster resources and cluster configuration.
Testing of the storage level resources (Does mount succeed? Do file permissions map to existing UID/GID on passive node?)
Verification of daemon resources. Check logs and ensure all correct processes are running.
GUI level validation to be completed last. Can the end-user connect via the VIP? Are Agents coming back online in a timely fashion? This should not take more than a few minutes as discussed in solution overview above.
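
The storage-level question about UID/GID mapping can be checked with a short script run on the passive node while the shared storage is mounted there. The following is a sketch under stated assumptions: unmapped_paths is a hypothetical helper, a Python 3 interpreter is assumed, and the mount point in the example comment is a guess based on the paths used elsewhere in this topic.

```python
import grp
import os
import pwd


def _uid_known(uid):
    """True if the UID resolves to a local account."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def _gid_known(gid):
    """True if the GID resolves to a local group."""
    try:
        grp.getgrgid(gid)
        return True
    except KeyError:
        return False


def unmapped_paths(root, uid_known=_uid_known, gid_known=_gid_known):
    """List directories and files under root whose owner or group is unknown here."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory being walked and every file inside it.
        for path in [dirpath] + [os.path.join(dirpath, n) for n in filenames]:
            st = os.lstat(path)
            if not uid_known(st.st_uid) or not gid_known(st.st_gid):
                bad.append(path)
    return sorted(bad)


# Example (the mount point is an assumption):
# print(unmapped_paths("/app/clarity"))
```

An empty result suggests that every owner and group on the shared storage maps to an account on the passive node. Any listed path points to a UID or GID mismatch to correct before relying on failover.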

Back to primary:

The procedure should be exactly the same, only in the opposite direction, and validation will also be the same.

Upgrade considerations

All upgrade activities must be conducted from the originally designated active node (Node A above). If the upgrade is run from any other node, the RPM database on that node will not reflect that the software has already been installed, and unexpected results will occur.

Multi-Manager environments

In a Multi-Manager environment, agents can be configured to fail over to alternate Satellite Managers. For more information, see High Availability environments.
