FDRPAS and IBM GDPS/PPRC HyperSwap
Geographically Dispersed Parallel Sysplex/Peer-to-Peer Remote Copy (GDPS/PPRC) HyperSwap is an IBM offering that non-disruptively swaps I/O activity from the primary volumes to the secondary volumes of PPRC mirrored pairs, when a problem occurs on any primary volume. Automation is used to swap all mirrored volumes at the same time.
A GDPS/PPRC HyperSwap cannot occur at the same time as an FDRPAS SWAP, and FDRPAS cannot SWAP a volume while it is eligible to be swapped by HyperSwap. FDRPAS offers facilities to disable HyperSwap for the minimum amount of time while FDRPAS does its UCB swaps, and then re-enable HyperSwap. With these facilities, HyperSwap remains active while FDRPAS copies all of the data from the source volumes to the target volumes. HyperSwap is disabled with the HYPERSW DISABLE command, which is much faster than the older HYPERSW OFF command. HYPERSW DISABLE only tells GDPS not to do a HyperSwap, and a subsequent HYPERSW ON tells GDPS that it may again do a HyperSwap when needed. HYPERSW OFF, by contrast, deconstructs the HyperSwap environment, so a subsequent HYPERSW ON has to rebuild it.
This section, FDRPAS and IBM GDPS/PPRC HyperSwap, deals with GDPS HyperSwap. See FDRPAS and IBM Basic HyperSwap for information on Basic HyperSwap.
In most cases, the number of volumes under the control of GDPS/PPRC HyperSwap is very large. To do an FDRPAS SWAP of volumes that are being managed by GDPS/PPRC HyperSwap, follow the procedure in FDRPAS for Large Scale Synchronized Migration (Using CONFIRM), specifying CONFIRMSWAP=YES on the SWAP jobs. In one of the MONITOR TYPE=CONFIRMSWAP jobs, include special steps detailed below to disable and re-enable HyperSwap at the appropriate times. In this section, the MONITOR TYPE=CONFIRMSWAP job with the special steps is called the “special 4-step job”. Run standard TYPE=CONFIRMSWAP jobs on the other LPARs running SWAP jobs. With CONFIRMSWAP=YES, FDRPAS allows HyperSwap to remain enabled during the data copy phase of each volume and until the SWAP operation is completed by MONITOR TYPE=CONFIRMSWAP; otherwise a volume being managed by HyperSwap would not be processed.
Terminology
This section uses some terms that have special meanings for GDPS:
Controlling system(s)
One or two LPARs in the GDPS sysplex that manage the GDPS functions. Also known as the “K-system(s)”. The controlling system(s) share a minimum of DASD and other resources with the production systems, so that they are not affected by problems that may cause outages on the production systems. If there are two controlling systems, one is the master controlling system and the other is the alternate controlling system.
Production system
Any LPAR in the GDPS sysplex that is not a controlling system. It can be a system that runs a production workload, or a test or development system.
Rules, Recommendations, and Considerations
The following rules, recommendations, and considerations apply to this procedure.
- Most installations that use GDPS consider it critical to keep the time that GDPS is disabled to an absolute minimum. The procedure in FDRPAS for Large Scale Synchronized Migration (Using CONFIRM) fulfills this requirement. With this procedure, HyperSwap is disabled only once during a migration of up to 15,000 volumes.
- In order for full duplexing to be in effect as soon as the FDRPAS SWAP tasks are complete, it is necessary to add the FDRPAS target devices to the GDPS/PPRC configuration before the FDRPAS SWAP tasks start. While FDRPAS is copying the data from the source device to the target device, PPRC copies the data from the target (which is a PPRC primary device) to its PPRC secondary device. The double copy causes the FDRPAS SWAP to run slower than it would otherwise, but it is necessary in order to have the target device synchronized with its PPRC secondary when GDPS is re-enabled after the FDRPAS SWAP.
- MONITOR tasks must be run on the GDPS controlling system(s), since the volumes being swapped are online on these LPARs.
- FDRPAS has a rule that the target devices must be offline on the LPAR running the SWAP task and on all LPARs running MONITOR tasks. In general, GDPS has a rule that all of the PPRC primary volumes must be online in the controlling system(s). In GDPS 3.9 and above, there is an exception that PPRC primary volumes are allowed to be offline in the controlling system(s) if they are marked as Reserve Storage Pool (RSP) volumes. Therefore, the FDRPAS target volumes should be initialized as RSP volumes before starting the SWAP tasks.
This is different from the usual procedure for FDRPAS, which does not require the target devices to be initialized in any way before the SWAP.
To initialize volumes as RSP, run the INIT command of ICKDSF with the RESERVED parameter. Do not specify a reserve storage pool name with the OWNERID or RESERVEPOOLNAME parameters, since the target volumes are not really in a reserve storage pool. The RESERVED parameter was added to ICKDSF by PTF UK70219, available 2011/08/04, for APAR PM16856.
If you are running GDPS at a lower level than 3.9, or if you do not have the support for the RESERVED parameter in ICKDSF INIT, you must start out with the target volumes as offline but not RSP. This may result in an alert being raised by GDPS monitoring on the controlling system(s).
- If you use the I/O timing facility as a trigger for HyperSwap, then you should disable it while running FDRPAS by issuing the console command:
SETIOS MIH,IOTHSWAP=NO
At the end of each copy pass, FDRPAS temporarily suspends all I/O to the source volume. Under some conditions, this can cause I/O requests to remain on the queue for longer than the I/O timing limit, especially if you have set a short limit (one minute or less). This condition is not an error and should not be allowed to trigger HyperSwap.
After you are finished running FDRPAS, you can re-enable I/O timing as a trigger for HyperSwap by issuing the console command:
SETIOS MIH,IOTHSWAP=YES
SWAP and MONITOR Jobs
The SWAP and MONITOR jobs should be generated with GENSWAP following the procedure in FDRPAS for Large Scale Synchronized Migration (Using CONFIRM), with the following considerations:
- The example in FDRPAS for Large Scale Synchronized Migration (Using CONFIRM) is for 15,000 volumes, but the setup is the same for any number of volumes.
- The example is set up for SWAP/SWAPDUMP jobs on six LPARs, MONITOR tasks on a total of ten LPARs, and MAXACTIVESWAPS=10. These are all just examples. Adjust all these values appropriately for your environment.
- Specify SWAP instead of SWAPDUMP.
- Specify CONFIRMSWAP=YES instead of CONFIRMSPLIT=YES.
- Do not specify CPYVOLID=YES. Instead, if the target volumes are initialized as RSP volumes before the SWAP, specify LABEL=SWAP so that after the SWAP, the source devices become RSP volumes and GDPS considers it valid for them to be offline.
Example of SWAP command:
SWAP TYPE=FULL,MAXTASKS=100,CONFIRMSWAP=YES,LABEL=SWAP,
     LARGESWAP=16000,SWAPID=&&&,ALLOWPAV=YES,LARGERSIZE=OK
Completing the SWAP Operation
Once ALL of the volumes have reached synchronization, the SWAP operation can be completed. “Completed” means disabling HyperSwap, issuing the actual UCB swaps, and re-enabling HyperSwap. On one of the systems running SWAP jobs, run the special 4-step job described below. On any other LPARs running SWAP jobs, run a regular MONITOR TYPE=CONFIRMSWAP job as shown in FDRPAS for Large Scale Synchronized Migration (Using CONFIRM). A single GENSWAP run can generate the special 4-step job and all of the regular MONITOR TYPE=CONFIRMSWAP jobs.
Special 4-Step Job
Step 1: (CONFIRM) of the Special 4-Step Job
The first step (1) is the regular MONITOR TYPE=CONFIRMSWAP. This step checks that all of the SWAP tasks are active, to make sure that none of the volumes have been left out or have failed. It tells the SWAP tasks to start copying the JES and coupling volumes that had been deferred until this time. It then waits for all of the source volumes to reach the “ready to confirm” stage, in order to coordinate among the SWAP tasks and ensure that all of the volumes are swapped at the same time. Finally, it confirms all of the SWAP tasks.
Step 2: (DISABLE) of the Special 4-Step Job
Even when all of the volumes have been synchronized and confirmed, the SWAP tasks do not complete as long as the volumes are eligible to be swapped by HyperSwap. The second step (2) of the special job uses program FDREMCS (see FDR Extended MCS Software Console (FDREMCS)) to issue a MODIFY(F) command to NetView to disable HyperSwap. FDREMCS monitors the command responses for 60 seconds, looking for the GEO551I message indicating that HyperSwap has been disabled. (This is an exception to the general rule described in FDR Extended MCS Software Console (FDREMCS) that FDREMCS monitors responses for only 5 seconds, and that it has no way of knowing whether the specified command completed successfully or had an error.) When this message appears, FDREMCS sets a flag to notify FDRPAS. Meanwhile, the SWAP tasks cycle, testing for this flag every few seconds. As soon as the flag is set, the SWAP tasks complete and terminate.
- Change netview to the name of the NetView started task on the LPAR where the job runs.
- NetView parameters may need to be modified to accept z/OS MODIFY(F) commands from a console called “FDREMCS” (or whatever you specify for CONSOLE=); consult the NetView manuals for details.
- Security rules may need to be modified to allow the EMCS console to issue the MODIFY command; for details, consult the z/OS manuals and the manuals for RACF or your security system.
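The timed message watch described for Step 2 can be pictured with a small Python sketch. This is only a conceptual model, not FDREMCS code: the function name wait_for_message, its parameters, and the sample response lines are all hypothetical, and the real program reads responses from an EMCS console rather than from a list. Only the GEO551I message ID and the 60-second window come from the description above.

```python
import time
from typing import Callable, Iterable


def wait_for_message(
    responses: Iterable[str],
    message_id: str = "GEO551I",
    timeout_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if message_id shows up in a response line before the timeout.

    Hypothetical model: 'responses' stands in for the command responses that
    the console would deliver; 'clock' is injectable so the timeout can be
    exercised without actually waiting.
    """
    deadline = clock() + timeout_seconds
    for line in responses:
        if clock() > deadline:
            # The monitoring window expired before the message was seen.
            return False
        if message_id in line:
            # In the described procedure, this is the point at which a flag
            # would be set so that the waiting SWAP tasks can complete.
            return True
    return False


if __name__ == "__main__":
    # The message text after the ID is a placeholder, not real GDPS output.
    sample = ["<unrelated response>", "GEO551I <message text>"]
    print(wait_for_message(sample))
```

In this model a True result corresponds to the flag being set for FDRPAS; a False result means the message was not seen within the window, in which case the SWAP tasks would keep waiting.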
Step 3: (WAITTERM) of the Special 4-Step Job
The third step (3) of the special job waits for the SWAP tasks to terminate on all of the selected volumes. A //*SWAPNEXT statement precedes this step to tell GENSWAP to generate all of the same MOUNT commands as for the CONFIRM step.
Step 4: (ENABLE) of the Special 4-Step Job
The fourth and final step (4) of the special job uses program FDREMCS to issue a MODIFY(F) command to NetView to re-enable HyperSwap.
- Change netview to the name of the NetView started task on the system where the job runs.
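The ordering of the four steps, and the fact that HyperSwap is disabled only around the UCB swaps, can be summarized with a toy Python model. This is a sketch under stated assumptions, not a description of FDRPAS internals: the class MigrationModel, its state names, and the volume serials are invented for illustration, and the real steps wait for conditions to become true rather than raising errors.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MigrationModel:
    """Hypothetical model of the special 4-step job; all names are illustrative."""

    volumes: Dict[str, str]          # volume serial -> state
    hyperswap_enabled: bool = True   # HyperSwap stays enabled during the copy
    log: List[str] = field(default_factory=list)

    def confirm(self) -> None:
        # Step 1: every volume must be synchronized before confirming.
        # (The real step waits for this; the model simply refuses to go on.)
        not_ready = [v for v, s in self.volumes.items() if s != "synchronized"]
        if not_ready:
            raise RuntimeError(f"volumes not synchronized: {not_ready}")
        for volser in self.volumes:
            self.volumes[volser] = "confirmed"
        self.log.append("CONFIRM")

    def disable(self) -> None:
        # Step 2: HyperSwap is disabled just before the UCB swaps.
        self.hyperswap_enabled = False
        self.log.append("DISABLE")

    def wait_term(self) -> None:
        # Step 3: SWAP tasks cannot complete while HyperSwap is still enabled.
        if self.hyperswap_enabled:
            raise RuntimeError("SWAP tasks cannot complete: HyperSwap enabled")
        for volser in self.volumes:
            self.volumes[volser] = "swapped"
        self.log.append("WAITTERM")

    def enable(self) -> None:
        # Step 4: HyperSwap is re-enabled once all SWAP tasks have ended.
        self.hyperswap_enabled = True
        self.log.append("ENABLE")


def run_special_job(model: MigrationModel) -> None:
    """Run the four steps in the order the procedure describes."""
    model.confirm()
    model.disable()
    model.wait_term()
    model.enable()


if __name__ == "__main__":
    m = MigrationModel({"VOL001": "synchronized", "VOL002": "synchronized"})
    run_special_job(m)
    print(m.log, m.hyperswap_enabled)
```

The model makes the key property visible: HyperSwap is off only between the DISABLE and ENABLE steps, while all data copying has already finished with HyperSwap active.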
GENSWAP Example
Here is an example of the GENSWAP run to generate the special 4-step job and all of the regular MONITOR TYPE=CONFIRMSWAP jobs to complete the swap operation in a GDPS HyperSwap environment.
- A //*SWAPNEXT statement is used before each CONFIRM job to instruct GENSWAP to generate all of the same MOUNT statements for each job.
- An override of the //PAS.INTRDR DD statement directs the output to SYSOUT for viewing instead of submitting the output directly to the internal reader for execution. Be sure to review the generated jobstream well in advance of the time to run the actual CONFIRM job. When it is time to run the actual CONFIRM job, remove the //PAS.INTRDR DD statement to allow the generated jobs to be submitted to the LPARs for execution.
This example can be found in the JCL library installed with FDRPAS with member name PA32006A.
//GENCNFRM EXEC PASPROC,LIB=fdrpas.loadlib
1 //*SWAPJOB
//CONFIRM EXEC PASPROC,LIB=fdrpas.loadlib
2 //DISABLE EXEC PASPROC,PROG=FDREMCS,LIB=fdrpas.loadlib
COMMAND=MODIFY netview,HYPERSW DISABLE
3 //*SWAPNEXT
//WAITTERM EXEC PASPROC,LIB=fdrpas.loadlib
4 //ENABLE EXEC PASPROC,PROG=FDREMCS,LIB=fdrpas.loadlib
COMMAND=MODIFY netview,HYPERSW ON
//CONFIRM EXEC PASPROC,LIB=fdrpas.loadlib
//CONFIRM EXEC PASPROC,LIB=fdrpas.loadlib