
This topic provides recommendations for both planning and deployment of a highly available BMC Database Automation Manager. The solution presented is general and not tied to a specific clustering technology or product.

Overview of solution

Figure: HA Overview (BMC Database Automation)

The goal of the above configuration is to provide maximum service availability in the event of an isolated failure (for example, software, OS, physical hardware, or virtual guest) on one of the two nodes. This configuration does not provide storage redundancy, because redundancy is expected to have already been architected at the SAN level. Consequently, any datacenter-impacting outage will prevent this configuration from continuing to service user requests. See Disaster recovery for guidance on that scenario.

The shared storage (shown above) should be mounted only on the active node, and nearly all dynamic, application-related data is stored there. The specific configuration files described in the Prerequisites to installation section below are not shared and must therefore be maintained and kept in sync independently; they are the exception rather than the rule.

The BMC Database Automation management interface is accessed through a virtual IP address (usually configured via the vendor clusterware), although end users typically connect using the host name that corresponds to it. Existing sessions to the BMC Database Automation Manager are lost during failover, and users must refresh and re-authenticate when this happens. A complete failover or switchover should not take longer than five minutes; exceptions are usually caused by variability in the solution stack components, such as the clusterware, OS, or system specification.

BMC Database Automation components


postgresql

The underlying transactional database required for installation of the core BMC Database Automation solution. The postgresql schema hosts data pertaining to job history, patch metadata, node information, and user account data, among other things. It is therefore essential that this component be made highly available for a successful switchover/failover. Because of the dynamic nature of the data, it must be located on the shared storage component diagrammed above. By default, RHEL places the data and transaction log files on the /var partition, which generally is not feasible when the storage must be shared as it is in this architecture. The Prerequisites to installation section below describes how to relocate the Postgres database to the administrator's desired location.

PostgreSQL should be installed on both nodes. Review the documentation for your cluster solution to determine whether the services should be configured for auto-start.

dmanager

This is the product's network daemon, which provides persistent connectivity to all Agents pointed at this particular Manager (via the Agent configuration file). It is also used for the encrypted transmission of all workflows built in the middle tier (described below). It is critical that this service be available for any operational activities to take place, and it can be monitored in several ways (a sample check sketch follows the list):

  • PID file availability - /var/run/dmanager.pid
  • Port monitoring - TCP port 7003
  • System process table (dmanager)
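For example, a hypothetical status script could combine the three checks above; the exact form your clusterware resource agent expects will differ:

# Hypothetical dmanager health checks, run on the active node
test -f /var/run/dmanager.pid          # PID file availability
nc -z localhost 7003                   # TCP port 7003 accepting connections
pgrep -x dmanager > /dev/null          # dmanager present in the process table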

Only installed on the active node. During a product upgrade, this component must be located on the original active node (Node A) to ensure that the correct RPM database is used. Review the documentation for your cluster solution to determine whether the services should be configured for auto-start.

mtd

The business logic tier of the BMC Database Automation product, also known as the Middle Tier Daemon, where workflows are built into something that the target system's Agent (dagent) can understand. It can be monitored in a number of ways, a few of which are listed below (a sample check sketch follows the list):

  • PID file availability - /var/run/mtd.pid
  • System process table (mtd)
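As with dmanager, a hypothetical check for mtd could be as simple as:

# Hypothetical mtd health checks, run on the active node
test -f /var/run/mtd.pid               # PID file availability
pgrep -x mtd > /dev/null               # mtd present in the process table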

Only installed on the active node. During a product upgrade, this component must be located on the original active node (Node A) to ensure that the correct RPM database is used. Review the documentation for your cluster solution to determine whether the services should be configured for auto-start.

httpd

Vendor-provided (RHEL) Apache packages, which should be installed on both nodes. The default BMC Database Automation product installer creates several httpd configuration files and deposits Apache modules into the default location, /etc/httpd/modules. See the Prerequisites to installation section for details on which files are needed and where they should be placed; this is only required for the passive node, because the files are created/copied on the primary node when the management software is installed. For further details on the installation process, see the Installing section.

Patch packages

Patch packages are largely composed of vendor-supplied patches and BMC Database Automation-supplied metadata. All patch packages are created through the GUI and stored in a predetermined location inside the software installation hierarchy. This means that by default they end up on the shared storage component described above, and they must remain there for a failover/switchover to Node B to succeed. The BMC Database Automation Postgres database stores metadata about the on-disk patch packages, and for these references to remain accurate, the patch content must be the same on both nodes (accomplished by sharing the storage). This applies to all supported DBMS platforms:

  • Sybase
  • Oracle
  • MSSQL
  • DB2

/app/clarity/dmanager/var/pkgs

MSSQL media

If a customer will be performing SQL Server 2000/2005/2008 deployments, the installer media is stored underneath the product installation directory on the Management Server. Unlike the patch packages (above), the MSSQL media is not mapped to any Postgres metadata, but it is still strongly recommended that it be hosted in the default shared storage location.

/app/clarity/var/media

OPatch media

Many Oracle patches require a minimum version of OPatch (Oracle's patch installation tool) to be present. BMC Database Automation can push this from the Manager as part of a patch workflow to ease the management burden of applying Oracle updates. This content is also located inside the installation directory and should remain there so it can be failed over/switched over.

/app/clarity/oracle_media

Templates

All out-of-the-box product workflows can be customized via XML templates. These templates are built inside the BMC Database Automation GUI and are stored inside the product's installation directory by default. To maintain equivalent product functionality on Node B, templates should remain in the default directory so that they reside on shared storage.

/app/clarity/var/templates

Actions

Actions are custom automation run independently of the built-in workflows. The same rules that apply to Templates apply to Actions as well.

/app/clarity/dmanager/var/actions

Data warehouse

This optional component should be made highly available using Oracle HA technologies (Data Guard/RAC). See the Oracle documentation for further details.

Resources diagram

Figure: Clustered Resources (BMC Database Automation)

Prerequisites to installation

postgresql

Active/Passive Nodes:

After you have planned for deployment, you will need to configure postgresql to initialize the data directories in the desired location on the dedicated shared storage.

  1. Open the /etc/sysconfig/pgsql/postgresql file. Create a new one if it doesn't exist.
  2. Add the following line:
PGDATA=/new/pgsql/data/directory

When Postgres is started for the first time it will create all the necessary data and log subdirectories in the new location.
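For example, assuming the shared storage is mounted at /app/clarity (the pgsql/data subdirectory below is illustrative only), the relocation might look like this:

# Run on the active node with the shared storage mounted (example path only)
mkdir -p /app/clarity/pgsql/data
chown postgres:postgres /app/clarity/pgsql/data
chmod 700 /app/clarity/pgsql/data
echo 'PGDATA=/app/clarity/pgsql/data' >> /etc/sysconfig/pgsql/postgresql
service postgresql start               # first start creates the data and log subdirectories under PGDATA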

User/group configuration

Passive (Node B) only:

Two steps are required to make the passive node match the active node in this context (a command sketch follows the list):

  1. Create a 'clarity' user account with the exact same UID and GID used by the Active node.
  2. Add the 'clarity' user account to the existing 'apache' group.
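A hypothetical command sequence for this (the UID/GID value 701 is a placeholder; use whatever Node A reports, and match the group name Node A actually uses):

# On Node A: note the existing UID and GID
id clarity
# On Node B: recreate the account with the same UID/GID and add it to the apache group
groupadd -g 701 clarity
useradd -u 701 -g 701 clarity
usermod -aG apache clarity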

httpd

Passive (Node B) only:

There are a few post-configuration steps for Apache after the vendor-supplied (RHEL) packages have been installed on the passive node.

1. Copy the following Apache modules from Node A to Node B (source and destination directories should be the same), as sketched after the table:

File name                              | Description
/etc/httpd/modules/libphp5.so          | Custom distribution of the Apache PHP module.
/etc/httpd/modules/mod_auth_gridapp.so | BMC Database Automation authentication module.
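A hypothetical way to perform the copy (scp is used for illustration; node-b is a placeholder host name, and any transfer method that preserves permissions is fine):

scp -p /etc/httpd/modules/libphp5.so /etc/httpd/modules/mod_auth_gridapp.so node-b:/etc/httpd/modules/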


2. Copy the following Apache configuration files from Node A to Node B, as sketched after the table. Be sure to make backups of any existing configuration files:

File name                      | Description
/etc/httpd/conf.d/php.conf     | Configuration for the custom PHP installation.
/etc/httpd/conf.d/gridapp.conf | Additional BMC Database Automation-specific configuration stanzas for Apache.
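A similar hypothetical sketch for the configuration files, taking backups on Node B first (node-b is a placeholder host name):

# On Node B: back up any existing configuration files
cp -p /etc/httpd/conf.d/php.conf /etc/httpd/conf.d/php.conf.bak 2>/dev/null
cp -p /etc/httpd/conf.d/gridapp.conf /etc/httpd/conf.d/gridapp.conf.bak 2>/dev/null
# On Node A: copy the files over
scp -p /etc/httpd/conf.d/php.conf /etc/httpd/conf.d/gridapp.conf node-b:/etc/httpd/conf.d/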


Supported configurations


Supported                                         | Unsupported
2-node Active/Passive                             | Active/Active; Active/Passive with more than two nodes
RHEL 4/5 (x86, x86_64)                            | n/a
Shared binaries (configuration illustrated above) | Local BMC Database Automation binaries (will most likely work, but is untested)


Testing failover


BMC recommends that the cluster administrator issue the commands necessary to perform a switchover and then a switch back to validate the solution. Validation should include, but is not limited to, the following (a spot-check sketch follows the list):

  1. Validate the cluster resources and the cluster configuration.
  2. Test the storage-level resources. (Does the mount succeed? Do file permissions map to an existing UID/GID on the passive node?)
  3. Verify the daemon resources. Check the logs and ensure that all of the correct processes are running.
  4. Perform GUI-level validation last. Can the end user connect via the VIP? Are Agents coming back online in a timely fashion? This should not take more than a few minutes, as discussed in the solution overview above.
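A hypothetical set of spot checks on the newly active node, mirroring the list above (the mount point and virtual IP are placeholders; adjust the protocol and port to your configuration):

mount | grep /app/clarity              # shared storage mounted on this node?
ls -ld /app/clarity/var                # ownership resolves to the local clarity UID/GID?
service postgresql status              # database resource running?
pgrep -x dmanager && pgrep -x mtd      # product daemons running?
curl -k https://<virtual-ip>/          # management GUI reachable via the VIP?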

Back to primary:

The procedure should be exactly the same, only in the opposite direction, and validation will also be the same.

Upgrade considerations


All upgrade activities must be conducted from the originally designated active node (Node A above). If the upgrade is run from the wrong node, the RPM database will not reflect that the software has already been installed, and unexpected results will ensue.
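Before starting an upgrade, it may be worth confirming that the node you are logged on to actually holds the product packages in its local RPM database; the package name pattern below is an assumption and may need adjusting for your installation:

rpm -qa | grep -i clarity              # should list installed BMC Database Automation packages on Node A; empty output suggests the wrong node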