Preparing for zero-downtime upgrade of the platform
Zero downtime (ZDT) means that you do not have to schedule downtime for your production system while performing an in-place platform upgrade (Remedy AR System and BMC Atrium Core). Production work can continue while the upgrade proceeds in the background. If you are performing a zero-downtime upgrade of the platform in a server group, review this topic for prerequisites and perform the preconfiguration tasks before you upgrade.
From 9.1.04 onward, BMC recommends that you upgrade the platform components by using the zero-downtime upgrade method. Alternatively, you can upgrade the platform components during an outage window. For instructions, see Preparing the AR System server for a server group upgrade.
This topic provides the following details:
Zero-downtime upgrade process of Remedy platform and rollback
The following graphic shows the steps involved in the zero-downtime upgrade process of the Remedy platform components. If the upgrade process fails for any of the platform components, the platform components are automatically rolled back to the older version. Before restarting the upgrade process, you must fix the issues that caused failure.
The following video demonstrates the enhancements introduced in version 9.1.04 relevant to zero-downtime upgrade:
Zero-downtime upgrade support, preconfiguration tasks, and rollback
The following table shows the base versions of Remedy from which you can perform the zero-downtime platform upgrade:
|Version||Zero-downtime upgrade supported?||Rollback supported?||Preconfiguration tasks to be performed||Additional details|
|9.x - 9.1.04||Yes||Yes||None||
If the upgrade fails for any of the platform components, all the platform components (AR, CMDB, and AI) are rolled back to the earlier version automatically. For example, if you are upgrading from 9.0.00 to 9.1.04, if the AI upgrade fails, both CMDB and AR are rolled back to 9.0.00. The file system is also rolled back to the earlier version. However, the database remains upgraded even if the upgrade fails. Because the database is upgraded, some of the forms might display additional fields introduced in 9.1.04 but these fields are not functional.
If automatic rollback fails or if you cancel the upgrade process after clicking Install, run the rollback utility manually to revert the platform components to the earlier version. For information about using the rollback utility, see Troubleshooting installer failure during upgrade.
|Earlier than 9.x (8.x and 7.6.04)||Yes||No||
Beginning with version 9.1.04, for both in-place and staged upgrades, you must upgrade the platform components on all the servers of a server group and then upgrade the application components.
Zero-downtime upgrades apply only to in-place upgrades where the following prerequisites are met:
- You have a server group environment. Without a server group, zero downtime is not possible because there is always a need to restart the executable.
- You have configured Full Text Search (FTS) for High Availability (HA) so that the search requests are completed even when a server in the group becomes unavailable. If you have not configured FTS for High Availability, search requests might not be completed during the upgrade. For instructions, see Enabling FTS high availability.
- Your reconciliation package remains on Dev and QA servers in the form of a Deployment Management package (D2P Package) so that you are aware of what is needed for a ZDT upgrade. After upgrading the primary server, apply the D2P package in the Production environment to port the reconciliations.
- You have load-balanced your environment.
You must route all access by users and automated programs into the system through a load balancer so that the load can be redirected while each server is upgraded. Otherwise, any user or program that directly targets a single server loses access when that server is upgraded.
For automated feeds, if you want to maintain the operation, you can set up a load balancer to route all traffic to a specific server (for example, an integration server) and only redirect when that server is not available. This approach still isolates the load, but has a backup strategy if you lose that server.
You can route interactive traffic separately from automated traffic as normal. However, the load balancer must be able to fail over to another server; otherwise, zero downtime is not possible.
For further information about using a load balancer in a Remedy AR System deployment, see Configuring a hardware load balancer with BMC Remedy AR System.
- While performing a zero-downtime upgrade, remove the server that you are about to upgrade from the load balancer so that no user traffic reaches that server during the upgrade.
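The load balancer prerequisite above amounts to a rolling sequence: take one server out of rotation, upgrade it while the remaining servers carry the load, and return it to rotation. The following Python sketch only illustrates that ordering. The `pool` set stands in for the load balancer's active members and the `upgrade` callable stands in for running the installer on one server; both are hypothetical placeholders, not part of any BMC tool or API. Leaving a failed server out of rotation is also an assumption of this sketch.

```python
from typing import Callable, List, Set


def rolling_upgrade(
    servers: List[str],
    pool: Set[str],
    upgrade: Callable[[str], bool],
) -> List[str]:
    """Upgrade servers one at a time, removing each from the pool first.

    `pool` is a hypothetical stand-in for the load balancer's active
    members; `upgrade` is a hypothetical callable that upgrades one
    server and returns True on success.
    Returns the servers that were upgraded successfully.
    """
    upgraded = []
    for server in servers:
        # Without another server to take the load, zero downtime is impossible.
        if not (pool - {server}):
            raise RuntimeError("no other server available to take traffic")
        pool.discard(server)      # stop routing traffic to this server
        if not upgrade(server):   # other servers keep serving during the upgrade
            break                 # assumption: leave the server out until the failure is fixed
        pool.add(server)          # return the upgraded server to rotation
        upgraded.append(server)
    return upgraded


# Example with hypothetical server names and an upgrade step that always succeeds.
pool = {"ar1", "ar2", "ar3"}
done = rolling_upgrade(["ar1", "ar2", "ar3"], pool, upgrade=lambda server: True)
print(done)  # ['ar1', 'ar2', 'ar3']
```

The check at the top of the loop mirrors the requirement that a failover target must exist: with a single server, the function refuses to proceed rather than take the only server offline.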
Limitations and cautions
Read through the following considerations before performing the ZDT upgrade.
- If the ZDT upgrade is in progress for a server, the integrations configured for the server might not work. The integrations will work after the upgrade.
- Backup is not created for the database, bundle cache, and log folders during ZDT. A backup is created for the platform components and file systems.
- If the ZDT upgrade is in progress for a server, the server is not available. Because the server is not available, you might experience slow performance of services.
- You can optionally add a new server to the server group before starting the ZDT upgrade to compensate for the server that is unavailable during the upgrade. After the upgrade is complete, you can decommission this additional server.
- BMC also recommends that you perform the upgrade during off-peak hours.
- If the upgrade fails, the platform component that is being upgraded and the file system are rolled back to the earlier version. For example, if you are upgrading from 9.0.00 to 9.1.04, and if the AI upgrade fails, both CMDB and AR are rolled back to 9.0.00. The file system is also rolled back to the earlier version. However, the database remains upgraded even if the upgrade fails. Because the database is upgraded, some of the forms might display additional fields introduced in 9.1.04 but these fields are not functional.
- Ensure that you have applied the latest patches and hotfixes on your base version. For example, if you are upgrading from 9.0.00 to 9.1.04, apply the relevant patches and hotfixes on 9.0.00.
Activities to avoid or not perform during the zero-downtime upgrade
Activities not to be performed
Activities to be avoided if possible
Zero-downtime upgrade preconfiguration tasks
If you are upgrading from versions older than 9.x, you must perform the steps listed in this section.
Preparing the AR System servers for zero-downtime upgrade (versions older than 9.x)
Perform the following steps to configure the thread counts for the queues on secondary servers so that the minimum matches the maximum. For example, if the fast queue (390620) has a minimum of 3 threads and a maximum of 5, set the minimum to 5.
From AR System Administration, open the Server Information form, and click the Ports and Queues tab.
In the Server Queue section, update the Min Threads entries to match the Max Threads entries.
If you created private queues and threads to provide dedicated access to specific applications or users, update them.
For example, Plugin-ARDBC-Threads, Plugin-AREA-Threads, or Plugin-Filter-API-Threads.
Ensure that no new threads are created on the secondary server(s). As part of the go-live stage, you must revert all the changes, excluding steps 1 and 2 in this procedure.
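As a minimal illustration of the rule in this procedure (raise each queue's minimum thread count to its maximum), the following Python sketch applies it to an in-memory dictionary. The dictionary layout is a hypothetical stand-in for the values shown on the Ports and Queues tab; the sketch does not read from or write to an AR System server.

```python
def pin_min_to_max(queues):
    """Return a copy of the queue settings with each minimum raised to its maximum.

    `queues` maps a queue number to {"min": ..., "max": ...}; this layout is
    an assumption made for illustration only.
    """
    return {
        number: {"min": limits["max"], "max": limits["max"]}
        for number, limits in queues.items()
    }


# The example from the text: fast queue 390620 with min 3 and max 5.
current = {390620: {"min": 3, "max": 5}}
print(pin_min_to_max(current))  # {390620: {'min': 5, 'max': 5}}
```

Keeping the original dictionary unchanged makes it easy to restore the earlier values when the changes are reverted at go-live.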
Preparing the mid tier for zero-downtime upgrade (versions older than 9.x)
If you are upgrading to 9.1.04 from 8.x, 7.6.04 or earlier versions, you must perform the steps listed in this section.
Perform the following steps to ensure that Remedy Mid Tier has the entire cache preloaded until the upgrade is complete.
Open the Mid Tier Configuration tool.
Click AR Server Settings, and perform the following steps:
In the DELETE/EDIT column, check the server for which you want to update settings, and then click Edit.
Select the Pre-load check box to preload the cache.
Click Cache Settings, and perform the following steps:
Clear the Perform check check box.
Set the Resource Check Interval (Seconds) to your upgrade duration.
For example, if you have 4 servers and the estimated upgrade time for each server is 20 minutes (1200 seconds), set the interval to at least double the total upgrade time for all servers (1200 x 4 x 2 = 9600 seconds).
Ensure that the Enable Cache Persistence check box is selected.
In the table, click Sync Cache and wait for the action to be completed.
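The Resource Check Interval guidance in the procedure above is simple arithmetic: per-server upgrade time in seconds, multiplied by the number of servers, then doubled. The following Python sketch reproduces that calculation. The function and parameter names are illustrative only, and the factor of 2 reflects the "at least double" guidance in the text.

```python
def resource_check_interval(server_count: int, minutes_per_server: float,
                            safety_factor: int = 2) -> int:
    """Return a Resource Check Interval in seconds.

    Multiplies the estimated per-server upgrade time by the number of
    servers, then by a safety factor (2 = "at least double").
    """
    seconds_per_server = minutes_per_server * 60
    return int(seconds_per_server * server_count * safety_factor)


# The example from the text: 4 servers at 20 minutes each.
print(resource_check_interval(4, 20))  # 9600
```

Treat the result as a lower bound: if your estimate of the per-server upgrade time is uncertain, a larger value is the safer choice.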
Performing zero-downtime upgrade in a non-server-group environment
When you perform a zero-downtime upgrade in a non-server-group environment, the installer automatically adds all of the upgraded servers to a server group. Servers that were not upgraded to 9.1.04 but use the 9.1.04 database are not added to the server group.
Beginning with 9.1.04, multiple AR System Servers cannot share the same database without being part of a server group.
To share a database among multiple AR System Servers in a server group environment, you must enable the cache replication ports (40001 and 40002) among all servers.
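Before the upgrade, you might want to confirm that the cache replication ports are reachable between servers. The following Python sketch attempts a TCP connection to ports 40001 and 40002 on each listed host and reports the combinations that fail. The host names are hypothetical, and the check only covers connectivity from the machine where it runs, so it would need to be run from each server in the group. It is a generic network check, not a BMC utility.

```python
import socket

# Cache replication ports named in this topic.
CACHE_REPLICATION_PORTS = (40001, 40002)


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_ports(hosts, ports=CACHE_REPLICATION_PORTS, probe=port_is_reachable):
    """Return the (host, port) pairs for which the probe fails."""
    return [(host, port) for host in hosts for port in ports if not probe(host, port)]


if __name__ == "__main__":
    # Hypothetical server names; replace with the other servers in your group.
    for host, port in unreachable_ports(["arserver2.example.com", "arserver3.example.com"]):
        print(f"Cannot reach {host}:{port}")
```

A successful connection only shows that something is listening and that no firewall blocks the port; it does not confirm that cache replication itself is working.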
Where to go from here
Go to Upgrading the platform.
Back to process
When you have finished upgrading the platform (Remedy AR System and BMC Atrium Core), return to the Remedy ITSM Suite in-place upgrade process.
The upgrade process indicates when to upgrade secondary servers. As part of that process, you will complete the zero-downtime upgrade.