High availability
This topic provides recommendations for both planning and deployment of a highly available BMC Database Automation (BDA) Manager. The solution, known as High Availability (or HA), does not describe the automated solution using a specific clustering technology or product. For detailed instructions using a clustering solution, see Clustering the BDA manager service to automate High Availability.
Overview of HA solution
Disaster recovery provides maximum service availability in the event of isolated failures (for example, software or OS failures) on one of the two HA nodes. Consequently, any data center impacting outages prohibit this configuration from continuing to service user requests. For more information, see Disaster Recovery.
The shared storage, which is mounted on the active node, stores all dynamic- and application-related data. Specific configuration files must be maintained independently. For more information, see Requirements of a passive node.
The BDA management interface is accessed using a virtual IP address. Existing sessions to the BDA Manager are lost during failover; therefore, you must refresh or re-authenticate. The estimated duration of complete failover or switchover should not exceed five minutes. Exceptions to the typical failover time are likely because of the variability in the solution stack components, such as clusterware, OS, and system specifications.
BMC Database Automation components
The following components are related to HA in BDA.
postgresql
The postgreql is the underlying transactional database required for installation of the core BDA solution. The postgresql schema hosts data pertaining to job history, patch metadata, node information, and user accounts. It is an essential component for a successful switchover or failover. The postgresql must be installed on both nodes. By default, RHEL will always place the data and transaction log files on the /var partition, which is not feasible to share. The Requirements of a passive node section illustrates how to relocate the postgresql database to the administrator's desired location. Refer to the documentation for your cluster solution to see whether the services are configured for auto-start.
dmanager
The damanager is a network daemon of the BDA solution, which provides persistent connectivity to all agents pointed to a particular manager. It is also used for encrypted transmission of all workflows built in the middle tier. It is critical that this service be available for any operational activities to take place, and it can be monitored in following ways:
- PID file availability: /var/run/dmanager.pid
- Port monitoring: TCP port 7003
- System process table (dmanager)
The dmanager must be installed only on the active node. During the product upgrade, it must be located on the original active (Node A) in order to ensure that the correct RPM database is used. Refer to the documentation for your cluster solution to see whether the services are configured for auto-start.
For information about the dmanager configuration file, see dmanager.conf.
mtd
The Middle Tier Daemon (mtd) is a business logic tier of the BDA product. Workflows are built into something that the target system's Agent (dagent) can understand. This can be monitored in a host by different ways such as:
- PID file availability: /var/run/mtd.pid
- System process table (mtd)
The mtd must be installed only on the active node. During the product upgrade, it must be located on the original active (Node A) in order to ensure that the correct RPM database is used. Refer to the documentation for your cluster solution to see whether the services are configured for auto-start.
httpd
The httpd is a vendor provided (RHEL) Apache package. This package should be installed on both nodes. The default BDA product installer creates httpd configuration files and deposits Apache modules into the default location at /etc/httpd/modules. For more information, see Requirements of a passive node.
The httpd is required only for the passive node because these files are created or copied on the primary node when the management software is installed. For more information about the installation process, see the Installing section.
Patch packages
Patch packages are largely composed of vendor supplied patches and BDA-supplied metadata. All patch packages are created using the GUI interface and are stored at a predetermined location inside the software installation hierarchy. This means that, by default, the packages end up on the shared storage component, and it is important that they remain there for a successful failover or switchover event to Node B. The Postgresql database stores metadata about the on-disk patch packages, and in order for these references to be accurate, the patch content must be the same on both nodes (you do this by sharing the storage). This applies to all supported DBMS platforms:
- Sybase
- Oracle
- MSSQL
- DB2
/app/clarity/dmanager/var/pkgs
MSSQL media
If you plan to deploy Microsoft SQL Server, the installer media must be stored under the product installation directory on the Management Server. Unlike the patch packages, the MSSQL media is not mapped to any Postgresql metadata. BMC Software strongly recommends that MSSQL media is hosted at the default shared storage location.
/app/clarity/var/media
OPatch media
Many Oracle patches require a minimum version of OPatch (the provided patch installation tool) to be available. BDA has the option to push this from the Manager as part of a patch workflow to ease the management burden of applying Oracle updates. Once again, this is located inside the installation directory and it should remain there so it can be failed to or switched over.
/app/clarity/oracle_media
Templates
All out-of-the-box product workflows can be customized using XML templates. These templates are built in the BDA GUI and, by default, are stored inside the installation directory of the product. To maintain equivalent functionality on Node B, templates should remain in the default directory so they reside on a shared storage.
/app/clarity/var/templates
Actions
Actions are custom automations that run independently from built-in workflows. These Actions are built in the BDA GUI and, by default, are stored inside the installation directory of the product.
/app/clarity/dmanager/var/actions
Data warehouse
The data warehouse is an optional component that should be made highly available using Oracle HA technologies (Dataguard or RAC). Refer to Oracle documentation for additional details.
Resources diagram
Prerequisites to installation
- All nodes must have the same OS and update levels installed.
- Shared storage must be available on all nodes.
- OS configuration parameters, firewall, selinux, system settings, or third party software must configured the same way on all nodes.
Installation Instructions:
postgresql
Active and Passive Nodes:
For deployment, you must configure postgresql to initialize data directories at the desired location on the dedicated shared storage.
- Open the /etc/sysconfig/pgsql/postgresql file. Create a new one if it does not exist.
- Add the following path:
PGDATA=/new/pgsql/data/directory
When Postgresql is started for the first time it creates all the necessary data and logs subdirectories at the new location.
User/group configuration
Passive (Node B) only:
To make the Passive node look like the active node:
- Create a clarity user account with the same UID and GID used by the Active node.
- Add the clarity user account to the existing apache group.
httpd
Passive (Node B) only:
The following post-configuration steps are required for Apache after the vendor (RHEL) supplied packages have been installed on the passive node.
1. Copy the following Apache modules from Node A to Node B (the source and destination directory must be the same).
File name | Description |
---|---|
/etc/httpd/modules/libphp5.so | Custom distribution of Apache PHP Module |
/etc/httpd/modules/mod_auth_gridapp.so | BDA authentication module |
2. Copy the following Apache configuration files from Node A to Node B. Make sure you take backups of existing configuration files.
File name | Description |
---|---|
/etc/httpd/conf.d/php.conf | Configuration for custom PHP installation |
/etc/httpd/conf.d/gridapp.conf | Additional BDA specific configuration stanzas for Apache. |
Supported configurations
The following configurations are supported. To learn more about the supported configurations see OS requirements.
Supported | Unsupported |
---|---|
2Node Active/Passive | Active/Active, Active/Passive > 2 Nodes |
RHEL 5 (x86_64) | N/A |
Shared binaries (configuration illustrated above) | Local BDA binaries |
Testing failover
BMC Software recommends that the Cluster Administrator simulates the failure of the active node and issues commands necessary to perform a 'switchover' and a 'switch back' to validate the solution. Validation should include the following steps:
- Shut down the active node using normal or forced Power Off.
- Mount all file systems on the passive node.
- Configure the virtual IP address.
- Start the database and BDA services.
- Validate if BDA works correctly using the GUI interface.
Validation should always include following tests to insure that switchover will be successful:
- Validate the HA installation resources and configuration.
- Test the storage level resources. Does mount succeed? Do file permissions map to existing UID/GID on passive node?
- Verify the daemon resources. Check the logs and ensure that all appropriate processes are running.
- Complete the GUI level validation in the end. Can the end-user connect via the VIP? Are Agents coming back online in a timely fashion? This should not take more than a few minutes, as discussed in Overview of HA Solution.
Back to primary
The procedure is identical, except reverse the steps and perform the same validations.
Clustering the BDA manager service to automate HA
This section describes how to use the VCS clustering service using VCS hosted in different datacenters. Clustering helps to migrate data from active to passive node automatically.
Prerequisites for automation
The following prerequisites should be followed to automate HA:
- All datacenters must have the same VLAN and must provide only one external IP for the BDA manager service.
- For sharing of storage between all nodes, BMC recommends that you have separate storage for each datacenter and some storage level replication technology (continuous access or analog). Or you can have a shared cluster file system.
- VCS should be configured to avoid split-brain. BMC recommends I/O fencing.
Running HA in a clustered environment
- Provision all cluster nodes, and install the required software and the Veritas cluster suite (including the Veritas Volume manager).
- Present a shared lun to each node.
- Initialize the disk:
#vxdiskadd aluadisk0_0
- Create a diskgroup
:
#vxdg init dg_bmc_bda disk1=aluadisk0_0 - Create the volume
:
#vxassist -g dg_bmc_bda make vol_bmcbda 2G - Create a filesystem:
#mkfs.vxfs /devvx/dsk/dg_bmc_bda/vol_bmcbda
- Create an application directory and mount the FS to /app/:
#mkdir /app
#mount -t vxfs /dev/vx/dsk/dg_bmc_bda/vol_bmcbda /app/
- Install BDA and ensure it is running.
- Stop all BDA services (dmanager, mtd, httpd, postgresql).
- Move the database to a shared storage and link it back:
#mv /var/lib/pgsql/ /app/
#ln -s /app/pgsql/ /var/lib/pgsql
- Create init script links for external services.
#ln -s /etc/init.d/httpd /app/clarity/etc/init.d/httpd
#ln -s /etc/init.d/postgresql /app/clarity/etc/init.d/postgresql
- Save your existing configuration before you modify main.cf.
#haconf -makerw
#haconf -dump –makero
- Stop the VCS engine on all systems and leave the resources available.
#hastop -all -force
- Reconfigure main.cf as in sample file: set cluster name, IP address, hostnames, Veritas volume parameters and put it to /etc/VRTSvcs/conf/config/.
- Copy init and monitoring scripts to /opt/VRTSvcs/bin/.
- Start VCS and check if everything started:
#hastart
#hastatus
- Copy the following files to all locations where the nodes are saved:
/opt/VRTSvcs/bin/bmcbda.sh
/opt/VRTSvcs/bin/Monitor_script_bmcbda
/etc/httpd/conf.d/gridapp.conf
/etc/httpd/conf.d/php.conf
/etc/httpd/modules/libphp5.so
/etc/httpd/modules/mod_auth_gridapp.so
- Create the /app directory on all nodes.
- Remove the directory /var/lib/pgsql/ on secondary nodes and create a link:
#rm -rf /var/lib/pgsql/
#ln -s /app/pgsql/ /var/lib/pgsql
- Create a user named clarity with the main group named clarity and a secondary group named apache on all nodes.
The UID and GID must be the same as on the initial node. - Stop the cluster service on the initial node and test on all others.
#hagrp -offline -force ClusterService -sys
#hagrp -online ClusterService -sys
Upgrade considerations
All upgrade activities must be conducted from the originally designated active node (Node A). If this is not executed correctly, the RPM database will not reflect that the software has already been installed. This might lead to unexpected results. For clustered environments, the upgrade should be performed on the node where BDA was installed. After the upgrade, copy all the Apache files again to all nodes.
Multi-Manager environments
In a Multi-Manager environment, agents can be configured to fail over to alternate Satellite Managers. For more information, see High Availability environments.
Automating failover process
You can automate a failover procedure by any third party software such as Veritas Cluster Suite, HP ServiceGuard and so on.
Comments
Log in or register to comment.