Preparing for zero-downtime upgrade of the platform
Zero downtime (ZDT) means that you do not have to schedule downtime for your production system while performing an in-place platform upgrade (Remedy AR System and BMC Atrium Core). Production work can continue while the upgrade proceeds in the background. If you are performing a zero-downtime upgrade of the platform in a server group, review this topic for prerequisites and perform the preconfiguration tasks before you upgrade.
From 9.1.04 onward, BMC recommends that you upgrade the platform components by using the zero-downtime upgrade method. If you prefer, you can instead upgrade the platform components during an outage window. For instructions, see Preparing the AR System server for a server group upgrade.
Zero-downtime upgrade process of Remedy platform and rollback
The following graphic shows the steps involved in the zero-downtime upgrade process of the Remedy platform components. If the upgrade process fails for any of the platform components, the platform components are automatically rolled back to the older version. Before restarting the upgrade process, you must fix the issues that caused the failure.
Video
The following video demonstrates the enhancements introduced in version 9.1.04 relevant to zero-downtime upgrade:
Zero downtime upgrade support, preconfiguration tasks, and rollback
The following table shows the base versions of Remedy from which you can perform the zero-downtime platform upgrade:
Version | Zero-downtime upgrade supported? | Rollback supported? | Preconfiguration tasks to be performed | Additional details |
---|---|---|---|---|
9.x - 9.1.04 | Yes | Yes | None | If the upgrade fails for any of the platform components, all the platform components (AR, CMDB, and AI) are rolled back to the earlier version automatically. For example, if you are upgrading from 9.0.00 to 9.1.04 and the AI upgrade fails, both CMDB and AR are rolled back to 9.0.00. The file system is also rolled back to the earlier version. However, the database remains upgraded even if the upgrade fails. Because the database is upgraded, some of the forms might display additional fields introduced in 9.1.04, but these fields are not functional. If automatic rollback fails or if you cancel the upgrade process after clicking Install, run the rollback utility manually to revert the platform components to the earlier version. For information about using the rollback utility, see Troubleshooting installer failure during upgrade. |
Earlier than 9.x (8.x and 7.6.04) | Yes | No | | |
Note
Beginning with version 9.1.04, for both in-place and staged upgrades, you must upgrade the platform components on all the servers of a server group and then upgrade the application components.
Prerequisites
Zero-downtime upgrades apply only to in-place upgrades where the following prerequisites are met:
- You have a server group environment. Without a server group, zero downtime is not possible because there is always a need to restart the executable.
- You have configured Full Text Search (FTS) for High Availability (HA) so that the search requests are completed even when a server in the group becomes unavailable. If you have not configured FTS for High Availability, search requests might not be completed during the upgrade. For instructions, see Enabling FTS high availability.
- Your reconciliation package remains on Dev and QA servers in the form of a Deployment Management package (D2P Package) so that you are aware of what is needed for a ZDT upgrade. After upgrading the primary server, apply the D2P package in the Production environment to port the reconciliations.
- You have load-balanced your environment.
You must route all the accesses by users and automated programs into the system through a load balancer so that the load can be redirected when each server is upgraded. If you do not, the user or program that is directly targeting a single server will lose access when the server is upgraded.
For automated feeds, if you want to maintain the operation, you can set up a load balancer to route all traffic to a specific server (for example, an integration server) and only redirect when that server is not available. This approach still isolates the load, but has a backup strategy if you lose that server.
You can route interactive traffic separately from automated traffic as normal. The environment must be able to fail over; otherwise, you cannot have zero downtime.
For further information about using a load balancer in a Remedy AR System deployment, see Configuring a hardware load balancer with BMC Remedy AR System.
- While performing a zero-downtime upgrade, remove the server that you are upgrading from the load balancer to avoid user traffic to that server during the upgrade.
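The routing rule for automated feeds described above (send all traffic to one preferred server and redirect only when that server is unavailable) can be sketched as follows. This is only an illustration of the selection logic, not a load balancer configuration; the server names and the `is_available` health-check callback are hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_server(
    preferred: str,
    fallbacks: Sequence[str],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the preferred server if it is up, else the first available fallback."""
    if is_available(preferred):
        return preferred
    for server in fallbacks:
        if is_available(server):
            return server
    # No server can take the traffic, so zero downtime is not possible.
    return None


# Hypothetical server group: the integration server is being upgraded.
status = {"integration-01": False, "ar-02": True, "ar-03": True}
print(pick_server("integration-01", ["ar-02", "ar-03"], lambda s: status[s]))  # ar-02
```

The load stays isolated on the preferred server during normal operation, and the fallback list is consulted only while that server is out of rotation.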
Limitations and cautions
Read through the following considerations before performing the ZDT upgrade.
- If the ZDT upgrade is in progress for a server, the integrations configured for the server might not work. The integrations will work after the upgrade.
- Backup is not created for the database, bundle cache, and log folders during ZDT. A backup is created for the platform components and file systems.
- If the ZDT upgrade is in progress for a server, the server is not available. Because the server is not available, you might experience slow performance of services.
- You can optionally add a new server to the server group before starting the ZDT upgrade to compensate for the unavailability of the server that is being upgraded. After the upgrade is complete, you can decommission this additional server.
- BMC also recommends that you perform the upgrade during off-peak hours.
- If the upgrade fails for any platform component, the platform components and the file system are rolled back to the earlier version. For example, if you are upgrading from 9.0.00 to 9.1.04 and the AI upgrade fails, both CMDB and AR are rolled back to 9.0.00. However, the database remains upgraded even if the upgrade fails. Because the database is upgraded, some of the forms might display additional fields introduced in 9.1.04, but these fields are not functional.
- Ensure that you have applied the latest patches and hotfixes on your base version. For example, if you are upgrading from 9.0.00 to 9.1.04, apply the relevant patches and hotfixes on 9.0.00.
Activities that must not be performed or should be avoided during the zero-downtime upgrade
Activities not to be performed | Activities to be avoided if possible |
---|---|
|
|
Zero-downtime upgrade preconfiguration tasks
If you are upgrading from versions older than 9.x, you must perform the steps listed in this section.
Preparing the AR System servers for zero-downtime upgrade (versions older than 9.x)
Perform the following steps to configure the thread queue size on secondary servers so that the minimum thread count of each queue equals its maximum. For example, if the Fast queue (RPC program number 390620) has a minimum of 3 threads and a maximum of 5, set the minimum to 5.
1. From AR System Administration, open the Server Information form, and click the Ports and Queues tab.
2. In the Server Queue section, update the Min Threads entries to match the Max Threads entries.
3. If you created private queues and threads to provide dedicated access to specific applications or users, update them.
   For example, Plugin-ARDBC-Threads, Plugin-AREA-Threads, or Plugin-Filter-API-Threads.
Ensure that no new threads are created on the secondary server(s). As part of the go-live stage, you must revert all the changes, excluding steps 1 and 2 in this procedure.
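To make the min-equals-max rule concrete, the following sketch applies it to an in-memory list of queue settings. The `QueueSetting` record and the private queue values are hypothetical and only illustrate the rule; the actual change is made on the Ports and Queues tab as described above.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class QueueSetting:
    name: str
    rpc_number: int
    min_threads: int
    max_threads: int


def pin_min_to_max(queues: List[QueueSetting]) -> List[QueueSetting]:
    """Return copies of the settings with each minimum raised to the maximum."""
    return [replace(q, min_threads=q.max_threads) for q in queues]


# The Fast queue values come from the example in this topic;
# the private queue entry is a made-up sample.
current = [
    QueueSetting("Fast", 390620, 3, 5),
    QueueSetting("Private", 390621, 2, 4),
]

for q in pin_min_to_max(current):
    print(q.name, q.rpc_number, q.min_threads, q.max_threads)
```

Because the function returns new records and leaves `current` unchanged, the original minimum values remain available for the go-live stage, when the changes are reverted.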
Preparing the mid tier for zero-downtime upgrade (versions older than 9.x)
If you are upgrading to 9.1.04 from 8.x, 7.6.04 or earlier versions, you must perform the steps listed in this section.
Perform the following steps to ensure that Remedy Mid Tier has the entire cache preloaded until the upgrade is complete.
1. Open the Mid Tier Configuration tool.
2. Click AR Server Settings, and perform the following steps:
   a. In the DELETE/EDIT column, select the check box for the server whose settings you want to update, and then click Edit.
   b. Select the Pre-load check box to preload the cache.
3. Click Cache Settings, and perform the following steps:
   a. Clear the Perform check check box.
   b. Set the Resource Check Interval (Seconds) to cover your upgrade duration.
      For example, if you have 4 servers and the estimated time to upgrade each server is 20 minutes (1200 seconds), set the interval to at least double the total upgrade time for all servers (1200 x 4 x 2 = 9600 seconds).
   c. Ensure that the Enable Cache Persistence check box is selected.
   d. In the table, click Sync Cache and wait for the action to be completed.
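The Resource Check Interval calculation from the example above can be written as a small helper. The function name and parameters are illustrative, and the default safety factor of 2 reflects the "at least double" guidance in this topic.

```python
def resource_check_interval(
    server_count: int,
    minutes_per_server: int,
    safety_factor: int = 2,
) -> int:
    """Return the interval in seconds: per-server time x servers x safety factor."""
    if server_count < 1 or minutes_per_server < 1 or safety_factor < 1:
        raise ValueError("all arguments must be positive integers")
    seconds_per_server = minutes_per_server * 60
    return seconds_per_server * server_count * safety_factor


# Four servers at an estimated 20 minutes each, doubled for safety.
print(resource_check_interval(server_count=4, minutes_per_server=20))  # 9600
```

Treat the result as a lower bound; a larger value is safe if your upgrade estimates are uncertain.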
Performing zero-downtime upgrade in a non server group environment
When you perform a zero-downtime upgrade in a non server group environment, the installer automatically adds all of the upgraded servers to the server group. If there are servers that were not upgraded to 9.1.04 but are using the 9.1.04 database, those servers will not be added to a server group.
Beginning with 9.1.04, multiple AR System Servers cannot share the same database without being part of a server group.
To share a database among multiple AR System Servers in a server group environment, you must enable the cache replication ports (40001 and 40002) among all servers.
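Before the upgrade, you might want to confirm that the cache replication ports are reachable between the servers. The following sketch attempts a plain TCP connection to ports 40001 and 40002 on a given host. It assumes that a successful connection is a good enough indicator; it only shows that something is listening and that no firewall blocks the path, not that cache replication itself works.

```python
import socket
from typing import Dict, Iterable

# Cache replication ports named in this topic.
CACHE_REPLICATION_PORTS = (40001, 40002)


def check_ports(
    host: str,
    ports: Iterable[int] = CACHE_REPLICATION_PORTS,
    timeout: float = 3.0,
) -> Dict[int, bool]:
    """Map each port to True if a TCP connection to host:port succeeds."""
    results: Dict[int, bool] = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name resolution errors.
            results[port] = False
    return results
```

For example, running `check_ports("arserver-02.example.com")` from each server against every other member of the group (the host name here is hypothetical) returns a dictionary such as `{40001: True, 40002: False}`, which points to the port that still needs to be opened. A port reports as closed if the AR System server on the target host is not running.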
Where to go from here
Next task | Go to Upgrading the platform. |
---|---|
Back to process | When you have finished upgrading the platform (Remedy AR System and BMC Atrium Core), return to the Remedy ITSM Suite in-place upgrade process. The upgrade process indicates when to upgrade secondary servers. As part of that process, you will complete the zero-downtime upgrade. |
Related topic
Troubleshooting installer failure during upgrade
Comments
Okay. Why have us go to the Defining Queues and Configuring Threads link when that page only shows us how many to put in - which goes against setting all the threads to equal setting? May want to state to go to this document to see how to set the thread count to equal each other.
Also, what if I am on a single server environment? Do I have to set the minimum and maximum thread counts before I do the upgrade on the AR System Server?
Hello Leonard,
Sorry for the delay in response.
If the maximum value of thread count is not set to the minimum value, new threads can spawn on secondary nodes. In a zero-downtime upgrade scenario, if the secondary server detects a mismatch in the database version, it brings down the secondary server. Therefore, it is recommended that before upgrade, you set max=min on all the nodes so that no new threads get spawned during ZDT upgrade process. This is applicable only for a ZDT upgrade in a server group environment.
Regards,
Amit
Above it was stated:
"you set max=min on all the nodes so that no new threads get spawned during ZDT upgrade process. "
What it should say is that you set min=max. For example, if min = 3 and max = 5, then you would set min = 5.
Do you leave the environment set at zero-downtime and upgrade the ARS, midtier, Smart Reporting, Atrium CMDB, then remove the zero-downtime parameter and do the secondary server installs or do you leave the parameter in for the whole thing?
This article and step by step process is great for getting things setup, but it doesn't necessarily complete the full loop as far as how to complete everything and when to remove it.
Hi Jamie,
You need to have the is-ZDT parameter set to true until all platform servers, including secondary servers, are upgraded. You should set it to False after you complete the step for upgrading secondary servers.
I have updated the following page. Please let me know if you have any further questions.
https://docs.bmc.com/docs/display/brid91/Upgrading+secondary+servers
See the section, To complete an in-place, zero downtime upgrade of the platform after upgrading secondary servers.
Thanks,
Priya
Hello,
Under "Preparing the mid tier for zero-downtime upgrade", step 9, is it for a specific version/build? It instructs to look up all settings under Cache Settings, but -
-- In our system "Definition Change Check Interval" and "Perform check" is under AR System Settings
-- The parameters/settings in steps c and e are not present anywhere
-- Setting in point b is accurate
We are on Remedy mid-tier 9.1 (201701301546), could you please check if the instructions are accurate?
Thanks,
Rohit
Hi Rohit,
Sorry for the inconvenience. I have now updated the instructions. Please let me know if it looks all right.
Thanks,
Priya
Please proof and correct the following topics
configure the thread queue size
Point 2 refers to ALL Queues - Point 3 seems useless because all queues are handled by point 2.
What is meant with "As part of the go-live stage, you must revert all the changes excluding steps 1 and 2 in this procedure."? Step 2 is the major changing step.
Preparing the Midtier
Step 2 followed by step 3 isn't possible. You get an error message at the last step, "Sync Cache".
- 2c "Uncheck the Perform check box" sets the check time to 0
- Therefore the Sync step raises the error "cache manager is in dev mode, no syncing!!!"
Exchanging 2 and 3 seems to work.
Hi Stefan Hall,
Here is the response for your questions.
Based on my discussion with the SMEs, I made a couple of changes in the steps. Please check.
If I want to upgrade from 9.1 SP2 to SP3, I should do nothing for a ZDT installation?
No isZDT flag, no thread settings, no ... for primary AND secondary AR servers?
What about the mid tier server and ZDT?
Hi Stefan Hall,
You are right.
ZDT is used to upgrade only AR and Atrium Core. Mid Tier, other components, and applications have to be upgraded in an outage window.
Regards,
Sirisha
Hi Sirisha, only to be very sure: we upgrade our prod env from 9.1.02 to 9.1.03 and I have to prepare nothing? I install 9.1.03 on the primary and all our users can work with the secondary user AR servers? Without any problems?
Changing forms, fields and data ... no problems all the time? The AR server reboots many times. How does the secondary know that the installation has finished? Stefan
Hi Sirisha Dabiru, I'm trying to follow the Remedy ITSM Suite in-place upgrade process with ZDT, but I can't figure out if I need to upgrade the secondary after AR and Atrium Core with ZDT or if I can upgrade the application (ITSM) and then upgrade the secondary.
We are on 9.1 SP2
Hi Otto Von schwerin,
After the primary upgrade, you should upgrade the applications and then the secondary. Here is the sequence: Remedy ITSM Suite in-place upgrade process
This documentation is as confusing as it gets. Why don't you guys make a graphic depicting the whole process or at least provide a check list like:
a) update secondary nodes first then the primary b) preparing a secondary node c) preparing a primary node d) etc.
Anyway, my questions regarding zero downtime upgrade:
Do I start upgrading with the primary or the secondary server group nodes?
Do I have to put the is-ZDT into the ar.conf of all nodes? I understand that upgrading Remedy ARS 8.1 I need to restart the server process after adding this parameter to the ar.conf, right?
I understand that before the upgrade I need to open the AR System Server Group Operation Ranking form and set the Rank of the primary node to 1 and all the secondary nodes to empty. The nodes with the empty rank need to restarted?
I understand the min and max queues need to be set equal before the upgrade?
The node being updates needs to be taken out of load balancing?
Hi Thomas Miskiewicz,
We are revamping the ZDT documentation for the 9.1.04 release which includes flow diagrams.
Regarding your questions, I will check with the SMEs and get back to you.
Regards,
Sirisha
Can you provide answers to questions that Thomas Miskiewicz had?
Hello Thomas,
Sorry about the delay in responding to your questions. The process flow diagram is added to this page that depicts the entire process.
Do I start upgrading with the primary or the secondary server group nodes? Yes, you start with the primary.
Do I have to put the is-ZDT into the ar.conf of all nodes? I understand that upgrading Remedy ARS 8.1 I need to restart the server process after adding this parameter to the ar.conf, right? No, not required to put the is-ZDT into the ar.conf of all nodes.
I understand that before the upgrade I need to open the AR System Server Group Operation Ranking form and set the Rank of the primary node to 1 and all the secondary nodes to empty. The nodes with the empty rank need to restarted?
This is also not required. The operating mode setting takes care of it.
I understand the min and max queues need to be set equal before the upgrade? Yes, that's correct. This is required on all your 7.x and 8.x servers before the upgrade.
The node being updates needs to be taken out of load balancing? Yes.
Hi Sirisha
Hope you are well. I followed the steps you set up at the top, but because my load balancer was not working well on QA, I don't think it worked as it should have. However, I am going to try it out when upgrading my Production. Just wanted to know: should I first set the zdt to false in ar.conf before I put my servers back into the server group? Also, will I need to set the zdt to True if I am installing any patches for 9.1.03? Lastly, I just leave the threads as they are and do not change them back to whatever min was before I changed it, right?
Hi Phindiwe Moshoele,
I will check with the SMEs regarding your questions and get back to you.
Regards,
Sirisha
Hi Phindiwe Moshoele,
This topic was updated for 9.1.04 release due to the enhancements to zero-downtime upgrade. Please check the updated topic and let me know if it answers your questions.
Regards,
Himanshu
Last Question when do i change / reset the ranking. I think that your doc needs an ending part to it...
The section "To configure the ZDT parameter on the primary server" is repeated. Also, in that section it states that setting the option for is-ZDT = T will prevent restarts, but at what point do the new binaries get loaded into memory? Does that require a restart of the service?
Hi Raymond Viens,
Please check the updated topic. As the zdt parameter need not be configured, I have removed it from this topic.
Regards,
Sirisha
We are upgrading from version 8.1 to 9.1.04. Is ZDT even possible for this? Step 2c - There is no "Clear the Perform check box" on AR Server Settings, it is on Cache Settings (you should bold the word "check"). Step 3c - There is no "Preload Cache" in version 8.1, only "Sync Cache". In version 9.1 there are "Sync Cache" and "Preload".
Can you provide steps/documentation on how to perform ZDT from 8.1 to 9.1.04?
Hello Igor, ZDT is supported from 9.1 onwards. If you want to use ZDT from 8.1, please take a DB backup before you run the installer, because the installer does not support backup of files when you upgrade from 8.1.
On this page (first table) it states that ZDT is supported for "Earlier than 9.x (8.x and 7.6.04)". Rollback is not supported. Can you update the article accordingly?