FDRPAS operation
FDRPAS can swap volumes in use on a single system image, as well as those attached to multiple systems or LPARs in a shared DASD complex or sysplex, whether locally or remotely attached. Multiple volumes can be swapped concurrently.
FDRPAS tasks
FDRPAS operates as two kinds of tasks:
- The active SWAP task. This task initiates the swap of one or more DASD volumes to new DASD devices. It copies the data tracks from the source volume to the target DASD, and causes the operating system to swap all I/O to the target when the DASD volumes are synchronized. A single SWAP task can swap up to 100 DASD volumes concurrently, providing that the available private area storage (PVT) is at least 10M. If you need to swap more than 100 volumes concurrently, you must start multiple SWAP tasks.
Run the FDRDEBUG program (from the FDR or FDRPAS load library) on each LPAR to display the available private storage area.
Determining available PVT storage
//DEBUGSYS EXEC PGM=FDRDEBUG,PARM='FDRDEBUG SYS'
//STEPLIB DD DISP=SHR,DSN=fdrpas.loadlib <<== CHANGE DSN
//FDRDEBUG DD SYSOUT=*
The PVT line in the storage table shows the available private area storage.
AREA START    END      SIZE     SIZEDEC   ---ALLOC--- --FREE<4K-- TOTAL FREE- -ALLOC HWM- ALLOC+CONV HWM
PSA  00000000 00001FFF 00002000 (    8K)
SYSR 00002000 00005FFF 00004000 (   16K)
PVT  00006000 009FFFFF 009FA000 (10216K)
CSA  00A00000 00D1FFFF 00320000 ( 3200K)  948K 29%    98K  3%  2350K 73%   860K 26%   860K 26%
- The update MONITOR task. This task monitors one or more offline potential target DASD devices. It detects that a swap has begun on a DASD volume and installs I/O intercepts that monitor all I/O to the source volume for updates. It also causes the operating system to swap all I/O to the target when the DASD volumes are synchronized. You can use one MONITOR task (per system image) to monitor all potential target devices or you may choose to start multiple MONITOR tasks on each system image with each task monitoring a set or range of targets. You can even start one MONITOR task per target, if you prefer. If a MONITOR task is monitoring multiple target devices, it actually starts additional MONITOR tasks (one per target device) as internal subtasks or external started tasks when a swap request is detected.
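The arithmetic behind the storage table and the 100-volume limit can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes that START and END are inclusive hexadecimal addresses, which is consistent with the sample PVT row shown above.

```python
import math

MAX_VOLUMES_PER_SWAP_TASK = 100  # limit stated in the text


def area_size_bytes(start_hex: str, end_hex: str) -> int:
    """Size of a storage area, given inclusive START and END addresses in hex."""
    return int(end_hex, 16) - int(start_hex, 16) + 1


def swap_tasks_needed(volume_count: int) -> int:
    """Minimum number of SWAP tasks for `volume_count` concurrent swaps."""
    return math.ceil(volume_count / MAX_VOLUMES_PER_SWAP_TASK)


# PVT row of the sample table: START 00006000, END 009FFFFF
pvt_bytes = area_size_bytes("00006000", "009FFFFF")
print(f"{pvt_bytes:08X} ({pvt_bytes // 1024}K)")  # SIZE and SIZEDEC columns
print(swap_tasks_needed(250))                      # SWAP tasks for 250 volumes
```

For the sample row this reproduces the SIZE and SIZEDEC values in the table (009FA000, 10216K), and 250 concurrent volumes would need three SWAP tasks.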
When multiple CPUs or LPARs (“system images”) have access to a volume to be moved, the SWAP task for each volume executes on only one system image, but a MONITOR task must execute on every system image with access to the volume (up to 128 system images are supported). The SWAP task also acts as the MONITOR task on the system on which it is executing.
These tasks can be executed as submitted batch jobs, or as started tasks executed on-demand, although we refer to them as “tasks” in this manual.
FDRPAS volume swap
Swap of a DASD volume is very simple. An FDRPAS MONITOR task is started on each system that has access to the target device, monitoring that device. On one system, an FDRPAS SWAP task is started to initiate the swap of the online source volume to the offline target DASD device. It is usually desirable to execute the SWAP task on the system with the most update activity on the volume; however, if you are executing many swaps concurrently, you should spread the SWAP tasks across as many systems as possible.
The FDRPAS SWAP task communicates with the MONITOR tasks on all other systems to coordinate the swap operation. It verifies that every system that can see both the source and target volumes is involved in the swap. FDRPAS starts the swap only if the target device is offline to all sharing systems where the source volume is online to ensure that an active volume cannot be accidentally overlaid. However, FDRPAS cannot detect a target volume that is online to a system where the source volume is offline, so you must ensure that the target volume is not in use anywhere.
The FDRPAS SWAP task copies all allocated tracks (for some data sets, only used tracks) on the source volume to the target volume, while simultaneously detecting all updates to the source volume; updated tracks are re-copied if necessary so that the target volume eventually contains an exact image of all of the active data on the source volume. The target volume remains offline to z/OS during the copy, so that the copied data is protected until the swap is complete.
Once the copy is complete and the two devices are completely synchronized, FDRPAS completes the swap by asking the operating system to re-direct all I/O for the volume from the original source volume to the new target device on every system involved. The new device effectively replaces the original, and the original DASD volume is placed offline. All existing jobs, tasks, and users that were allocated to the volume are now allocated to the target device, although they are unaware that the swap has taken place.
When the swap is complete, the volume label on the old source volume is modified so that the operating system is no longer able to vary it online. When the system is next re-IPL’d, it finds the volume on the target device and does not attempt to use the old source volume. To be sure that this occurs, do not mark the target devices offline in your I/O configuration.
Once all volumes in a DASD subsystem have been swapped to new devices, you can power off and disconnect the old subsystem, if that is your intention. If you want to reuse an old device, you can either do an offline INIT with the IBM ICKDSF utility (specifying NOVERIFY) to give it a new volume serial, or execute the FDRPAS MONITOR TYPE=VARYONLINE function (see MONITOR RESET and VARYONLINE Statement) to modify the volume label on the original device so that it can be re-mounted.
Only the source and target devices are accessed by FDRPAS during the swap. FDRPAS does not use any additional communication between systems. It does not require TCP/IP, VTAM, a data set on a third DASD volume, or a coupling facility. The one exception is a SWAPDUMP to a remote site, which uses TCP/IP (see FDRPAS SWAPDUMP to a Remote Site via TCP/IP).
The swap is accomplished with minimal impact on the performance of applications using the volumes being swapped. Applications continue to execute, unaware that the data movement is occurring or has completed. FDRPAS manages the copy to minimize its effect on the system. For example, inactive data sets are copied first, and tracks within active data sets that are updated are deferred until the end of the copy, so that they do not have to be copied many times. If the FDRPAS copy I/O is noticeably impacting system performance, you can request that the FDRPAS I/O be paced, adding a small delay between each I/O to allow other applications access to the DASD volumes and channels; I/O pacing can be dynamically modified during the swap process.
Swapping of a volume can be terminated at any time before the final swap without affecting the original device or any applications using it. FDRPAS ISPF panels can be used to terminate the swap. Alternately, you can cancel a SWAP task and all of the active swaps in that task terminate with an error.
Operating system swap services are invoked to perform the final swap. As a result of this swap service, the Unit Control Blocks (UCBs) of the source and target volumes are swapped in memory, so that the original source UCB now points to the new device, and vice versa. This allows the UCB pointers of all jobs, tasks, and users who have the source volume allocated to remain unchanged, so they are unaware that a new device is in use.
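The effect of the UCB swap on existing allocations can be pictured with a toy model. The `UCB` class below is a hypothetical stand-in with illustrative fields, not the real control block layout, and the device numbers are made up for the example. It shows only the idea that a holder of a pointer to the source UCB keeps the same pointer while the device behind it changes.

```python
from dataclasses import dataclass


@dataclass
class UCB:
    """Toy stand-in for a Unit Control Block (fields are illustrative only)."""
    device: str   # physical device this block currently describes
    online: bool


def swap_devices(source: UCB, target: UCB) -> None:
    """Exchange the devices described by the two blocks, in place."""
    source.device, target.device = target.device, source.device


source_ucb = UCB(device="20CC", online=True)    # original device, in use
target_ucb = UCB(device="20CB", online=False)   # new device, offline

job_allocation = source_ucb          # a job holds a pointer to the source UCB
swap_devices(source_ucb, target_ucb)

print(job_allocation is source_ucb)  # the pointer itself is unchanged
print(job_allocation.device)         # but it now leads to the new device
print(target_ucb.device, target_ucb.online)  # old device sits behind the offline block
```

Because the blocks are updated in place rather than replaced, nothing that refers to the source UCB has to be found and updated.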
After a successful swap, the now-offline original device can be used as a point-in-time backup of the volume, at the point of the final swap. If you are using FDRPAS to migrate to new hardware, when all volumes in the old DASD subsystem have been swapped to new DASD volumes, the old subsystem can be disconnected and removed.
SWAP phases
The operation of FDRPAS is divided into five phases:
Phase 1: Initialization
This phase begins when a swap is requested by an FDRPAS SWAP task as well as during the SIMSWAPMON process. The swap request is validated and, if multiple systems are involved, the FDRPAS MONITOR tasks on the other systems are notified of the swap request. Since the SIMSWAPMON task performs the same processing as the swap process, it is highly recommended to run this to ensure that the swap process runs without errors. The SIMSWAPMON task prepares for the real swap and ensures a cleaner swap process without performing the swap.
- If CONFMESS=YES was specified, FDRPAS asks the system operator for permission to continue, via a WTOR with message FDRW01. You can also reply to this message from the FDRPAS ISPF panels. If WTOR=NO is specified, the console message is only a WTO and you must use the ISPF panels to reply.
- FDRPAS verifies that the specified source volume and target device are valid for a swap, making sure that they are the same DASD device type, that the target is offline, and that the source is eligible to be swapped. It also checks if the devices have the same number of data cylinders unless LARGERSIZE=OK is specified; in that case the target can be larger. If FDRPAS security is enabled, FDRPAS verifies that the security user id associated with the SWAP task has proper authority.
- If multiple systems have access to the source volume, the SWAP task indicates that a swap is beginning and waits for the MONITOR tasks on the other systems to acknowledge that they are ready to participate. On the FDRPAS ISPF panels, the status shows as SYNCHRONIZING.
- Each MONITOR task acknowledges that it has access to both the source and target devices, that the target is offline, and that it is ready to participate. If the target device is not offline on the LPAR where a MONITOR task is running, the MONITOR task does special checking to ensure that this device is the same target device as specified by the main FDRPAS process and that the device is inactive on that LPAR. If so, the MONITOR task varies the volume offline. If a system can access the target device but not the source volume, the MONITOR task indicates that it does not need to participate.
- When the proper number of MONITOR tasks have acknowledged that they are ready to participate, the SWAP task proceeds. If the expected number of systems have not acknowledged within a time limit, this probably means that an FDRPAS MONITOR task for the target device was not running on all required systems, that the target was not offline on one or more systems, or that one or more systems do not have access to the target. You must run a MONITOR task on every system that has access to the source volume, even if it is offline, and those systems must also have access to the target device. FDRPAS asks if it should continue with the swap even though the expected number of systems are not participating by issuing message FDRW68 (unless you specify NONRESPONDING=FAIL).
- Note that the FDRW68 message is issued as a WTOR, to which the system operator can reply, by default. You can also display and reply to the message by using the FDRPAS ISPF panels. Optionally, you can change the message to a simple WTO so that the system operator cannot reply; in this case the ISPF panels must be used to reply. If the system operator is not involved in the swaps, the WTOR=NO operand is recommended to prevent erroneous replies.
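The acknowledgment logic at the end of Phase 1 can be summarized as a small decision function. This is a sketch under assumptions: the function and enum names are hypothetical, and the text above does not spell out what NONRESPONDING=FAIL does beyond suppressing FDRW68, so treating it as "fail the swap" is an inference from the operand name.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    PROCEED = "proceed to Phase 2"
    PROMPT_FDRW68 = "issue FDRW68 and wait for a reply"
    FAIL = "fail the swap"  # assumed meaning of NONRESPONDING=FAIL


def phase1_action(expected: int, acknowledged: int,
                  time_limit_reached: bool,
                  nonresponding_fail: bool = False) -> Optional[Action]:
    """Decide what the SWAP task does next; None means keep waiting."""
    if acknowledged >= expected:
        return Action.PROCEED
    if not time_limit_reached:
        return None
    return Action.FAIL if nonresponding_fail else Action.PROMPT_FDRW68


print(phase1_action(expected=4, acknowledged=4, time_limit_reached=False))
print(phase1_action(expected=4, acknowledged=3, time_limit_reached=True))
```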
Phase 2: Activation
The SWAP task signals that Phase 2 has begun. On each system, FDRPAS temporarily suspends all application and system I/O to the source volume and installs an I/O intercept to monitor updates to the source volume. When this is done on all systems, I/O is allowed to proceed. The swap has now begun. The time required to complete Phase 2 varies depending on the number of systems involved. On the FDRPAS ISPF panels, the status now shows as ACTIVE.
Phase 3: Copy
The SWAP task copies data tracks from the source volume to the target device, reading and writing up to 15 tracks per I/O.
- The first pass of the Phase 3 copy copies all tracks on the source volume. Only tracks currently allocated to a data set are copied, plus tracks in the VTOC, VTOC index, VVDS and volume label. For Physical Sequential (PS), Partitioned Organization (PO), and VSAM data sets, only used tracks are copied unless those data sets are allocated to some job or task at the beginning of the swap; in that case all allocated tracks are copied.
- While the Phase 3 copy is progressing, the I/O intercepts on each system are monitoring I/Os to the source volume to identify tracks that are updated. At the end of each pass of Phase 3, a consolidated list of updated tracks is collected (see Phase 4) and an additional pass of Phase 3 is made to re-copy those updated tracks. These additional Phase 3 passes continue until the number of tracks remaining to be copied is small.
- Before a track is copied, FDRPAS checks whether the I/O intercept on the system running the SWAP task has determined that the track was updated during the current pass. If so, FDRPAS defers copying the track until the next pass. This avoids unnecessarily copying tracks that would only need to be re-copied.
Phase 4: Update Consolidation
At the end of each Phase 3 copy pass, Phase 4 is entered and the SWAP task requests a list of updated tracks from each MONITOR task. I/O to the source volume is suspended briefly on all systems while this information is collected. A consolidated list of tracks updated on all systems is formed. FDRPAS determines if it can complete the swap:
- If the number of tracks in the list is above a threshold, Phase 3 is re-entered to re-copy the updated tracks. Note that after every Phase 3 pass, the threshold value is increased, in case the rate of updates to the source volume is very high.
- If the number of tracks in the list is below the threshold or there are no updated tracks in the list, then FDRPAS is ready to complete the swap.
- If CONFIRMSWAP=YES was specified on the SWAP statement, then you do not want the swap to complete until you tell it to, so FDRPAS simply re-enters Phase 3 to copy the updated tracks (we recommend that you not use CONFIRMSWAP=YES unless you need to complete the swap of many volumes at the same time). This continues until you confirm the swap (if the number of updated tracks again rises above the threshold, the volume is no longer “ready to swap” until it falls again). If there are no tracks in the update list, FDRPAS simply waits for an interval and tests for updates again. You can confirm the swap in two ways: the FDRPAS ISPF panels can be used to monitor the progress of the swaps and confirm the swap of one or more volumes, or you can submit a MONITOR TYPE=CONFIRMSWAP job to wait for one or more DASD volumes to become ready for completion and automatically confirm the swap. CONFIRMSWAP=YES does not result in any console message or WTOR.
- If CONFIRMSWAP=NO was specified or defaulted, then FDRPAS automatically completes the swap as soon as the number of updated tracks in Phase 4 falls below the current threshold.
- On every system, FDRPAS disables all application and system I/O to the source volume, then enters Phase 3 for one last pass to copy the remaining updated tracks (unless the updated track list is empty). Depending on the current value of the threshold and the number of tracks in the list, I/O is suspended for a few seconds. This quiesce time depends on the number of participating systems and the number of updated tracks to be copied.
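The interplay of Phase 3 and Phase 4 amounts to a loop that repeats until the consolidated update list is no longer above a rising threshold. The simulation below is a minimal sketch, not the FDRPAS algorithm: the starting threshold, the amount by which it grows, and the treatment of a count exactly equal to the threshold are all assumptions chosen for illustration, and CONFIRMSWAP=YES is not modeled.

```python
from typing import List, Tuple


def simulate_passes(updates_after_pass: List[int],
                    threshold: int,
                    threshold_step: int) -> Tuple[int, int]:
    """Simulate the Phase 3 / Phase 4 loop.

    updates_after_pass[i] is the size of the consolidated update list that
    Phase 4 collects after copy pass i+1. Returns (number of copy passes made
    before the final quiesced pass, tracks left for that final pass).
    """
    for pass_number, updated in enumerate(updates_after_pass, start=1):
        if updated <= threshold:          # not above threshold: ready to complete
            return pass_number, updated
        threshold += threshold_step       # threshold grows after every pass
    raise ValueError("update rate never fell to the threshold in this simulation")


# Hypothetical update counts seen after each pass, hypothetical threshold values
print(simulate_passes([900, 400, 120, 30], threshold=50, threshold_step=50))
```

In this run the third pass ends with 120 updated tracks against a threshold that has grown to 150, so the swap would complete with one last pass of 120 tracks while I/O is quiesced.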
Phase 5: Swap Completion
At this point the source and target devices are completely synchronized. On every system, FDRPAS invokes operating system services to swap the devices. The volume now appears to be mounted on the target device, which is now online; all future I/O is directed to the target device; and all jobs, tasks and users that have the volume allocated are now pointed to the target device. The original source volume is placed offline and its volume label is modified so that it cannot be accidentally placed online again. FDRPAS removes its I/O intercepts on all systems and re-enables I/O to the volume. The swap is complete. On the FDRPAS ISPF panels, the status shows as COMPLETED but only for swaps that previously had a status of ACTIVE.
Automatic swap termination
If the MONITOR task on any system fails to respond in any phase of the swap (except Phase 5), the SWAP task automatically terminates the swap. This probably means that a MONITOR task has abnormally terminated or been canceled, or a system involved in the swap has crashed or been shut down.
Similarly, if the SWAP task is abnormally terminated or canceled, or the system executing the SWAP task crashes or is shut down, the swap is terminated.
If an I/O is issued to the source volume on any system that contains Channel Command Words (CCWs) that are not recognized by FDRPAS, the swap is terminated, since FDRPAS cannot tell if that I/O has updated the source volume, or what tracks it has updated. This probably means that the source volume DASD subsystem supports special vendor-specific CCWs for functions that are unknown to FDRPAS. In this case, FDRPAS prints some diagnostic information about the suspect CCW chain and the job that issued it. You should contact BMC Support with this printout so that we can attempt to identify the CCWs and enhance FDRPAS to handle them properly. If you can determine that the job has used functions that are restricted during an FDRPAS operation (such as Concurrent Copy (CC), see FDRPAS Special Considerations), you may be able to re-execute FDRPAS at a time when those functions are not in use.
Eligible volumes for swap
All volumes are eligible to be swapped except.
The system residence (IPL) volume can be swapped, but you must be sure to update your IPL parameters on all affected systems with the new IPL address before the next IPL.
However, you should read FDRPAS Special Considerations carefully, since there may be steps you need to take before moving certain volumes.
Point-in-time backups
When FDRPAS is used to create a point-in-time backup (the SWAPDUMP statement), the operation of FDRPAS is similar to the operation of a normal swap except that the volumes are not swapped at the end of the operation. FDRPAS simply terminates, leaving the target device with an exact copy of the source volume (except that the label is changed from “VOL1” to “FDR3”) at the point that FDRPAS ended.
Start an FDRPAS SWAPDUMP operation for all volumes involved in the backup well before the backup is to be taken to give FDRPAS time to synchronize all those volumes. Volumes involved in a SWAPDUMP backup cannot also be involved in a true swap, and no more than one SWAPDUMP can be in operation for a given volume at one time.
Normally, you want to specify the CONFIRMSPLIT=YES operand on the SWAPDUMP statement. This operates identically to the CONFIRMSWAP=YES operand of the SWAP statement, causing FDRPAS to continue to operate even when the volumes are synchronized, recopying updated tracks as necessary to maintain the synchronization. When you are ready to take the backup of the volumes, you must “confirm” them through the FDRPAS ISPF interface or by submitting a MONITOR TYPE=CONFIRMSPLIT statement, which terminates FDRPAS and makes the offline target volumes available for dumping.
FDRPAS SWAPDUMP supports FDRINSTANT backups with FDR and FDRDSF, and data set copies with FDRCOPY. It does not support FDRABR®.
CONFIRMSWAP and CONFIRMSPLIT
By default, a SWAP operation (to actually move a volume) and a SWAPDUMP operation (to create a point-in-time backup) complete automatically as soon as the source volume and target device are synchronized or when only a small number of data tracks remain to be synchronized. No operator or user intervention is required to complete the operation.
However, the CONFIRMSWAP=YES operand (for SWAP) and CONFIRMSPLIT=YES operand (for SWAPDUMP) can be used to allow the operator or user to control when the operation on a given DASD volume completes. If these operands are specified, then FDRPAS enters an “idle” state when the devices are synchronized or close to synchronization. In this state, FDRPAS continues monitoring the source volume for updates and re-entering Phase 3 (as documented earlier) to periodically copy the updated tracks, to keep the devices in close synchronization. However, it continues to do this indefinitely until it is instructed to complete the operation.
Why would you want to do this? For a SWAP, you generally do not want to use CONFIRMSWAP=YES unless you have some special reason for wanting to control when the swap to the new device actually occurs. When swapping a single volume, there is rarely any reason to do so, since you usually want the swap to complete as soon as possible. Even when swapping many volumes in parallel, you usually want to let each volume swap as soon as it is synchronized. However, if you have some reason that you need to co-ordinate the actual swaps, you can use CONFIRMSWAP=YES. In most cases, you should omit CONFIRMSWAP=YES.
For a SWAPDUMP, CONFIRMSPLIT=YES may make sense, since it allows you to control the time that the point-in-time backup is frozen. It may be especially useful when creating point-in-time backups of many DASD volumes, so that they can all be frozen at approximately the same time.
CONFIRMSWAP=YES and CONFIRMSPLIT=YES do not result in any console messages or WTORs (although some users seem to expect that they do). There are two ways to tell FDRPAS to complete the operation:
- If using the FDRPAS ISPF panels to monitor FDRPAS operations, the panels tell you which SWAP and SWAPDUMP tasks have used the confirm operand, and also tell you when each volume has reached synchronization and is ready to confirm. You can then enter a command on the panel to confirm one or more DASD volumes and complete their operations.
- If you want to automate the process, you can use an FDRPAS job or started task with the MONITOR TYPE=CONFIRMSWAP or TYPE=CONFIRMSPLIT statement (see MONITOR CONFIRM Statement). This is followed by one or more MOUNT statements (see MONITOR CONFIRM MOUNT Statement) identifying DASD volumes. When all of the volumes identified are in the “ready to confirm” state, they are all confirmed automatically. This is an easy way to automatically complete the SWAP or SWAPDUMP operation for a set of volumes at the same time.
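The all-or-nothing behavior of the automated confirm can be expressed as a short check. This is only a sketch of the rule described above: the status strings and the function name are hypothetical, and the volume serials are made up for the example.

```python
from typing import Dict, List


def volumes_to_confirm(requested: List[str], status: Dict[str, str]) -> List[str]:
    """Return every requested volume once all are ready; otherwise none.

    `status` maps a volume serial to a state string. "READY" is a made-up
    label for the "ready to confirm" state described in the text.
    """
    if all(status.get(volser) == "READY" for volser in requested):
        return list(requested)
    return []


status = {"SH20CC": "READY", "SH20C6": "COPYING"}
print(volumes_to_confirm(["SH20CC", "SH20C6"], status))  # one volume not ready yet

status["SH20C6"] = "READY"
print(volumes_to_confirm(["SH20CC", "SH20C6"], status))  # now both are confirmed
```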
I/O Pacing
By default, FDRPAS does I/O to the source and target devices as rapidly as the hardware and operating system allow. Up to 15 tracks are read or written per I/O (unless overridden by BUFNO=). This allows FDRPAS to complete the swap of a volume very quickly. The swap of a 3390-3 typically completes in one or two minutes, depending on the number of tracks to be copied, source and target device types, and so on.
If there is I/O activity on the volume from other applications or the system, the FDRPAS I/O may have an impact, causing the other I/O to be delayed or elongated. In most cases, this degradation is not noticeable; batch jobs that are using the volume may run a little longer and online users may see a slight increase in response time. Since the degradation vanishes as soon as the swap is complete, there is usually no need to be concerned about it. If you are swapping volumes to newer, faster hardware, response time improves as soon as the swap is complete, so it is desirable to complete it as quickly as possible.
However, you may have an environment where online response time or batch service times are extremely important so that the FDRPAS degradation is not acceptable. The obvious solution is to run FDRPAS off-hours when the impact is not noticeable, but if that is not practical, FDRPAS includes I/O pacing options to reduce the impact of its I/O.
FDRPAS I/O pacing works by inserting a time delay between WRITE I/Os to the target device. This also causes delays between READ I/Os on the source volume (note that if the target hardware is significantly faster than the source, it may require large pacing delays before the source I/O is delayed).
Static I/O Pacing is invoked by specifying the PACEDELAY=nn operand on the SWAP or SWAPDUMP statement. This introduces a fixed delay of “nn” hundredths of a second between writes. The PACEDELAY= value can also be interactively modified from the FDRPAS ISPF panels, even if it was not specified when the swap was started. Therefore, if the FDRPAS I/Os are causing unacceptable degradation, you can change the pacing values up and down from the panels until you are satisfied with the results.
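A rough estimate of how much elapsed time a given PACEDELAY= value adds can be computed from the number of write I/Os. The sketch below assumes that every write carries the full 15 tracks and that the delay applies once between each pair of consecutive writes; real swaps re-copy updated tracks and may use shorter I/Os, so treat the result as a lower bound. The function name and the track count are illustrative.

```python
import math


def added_pacing_seconds(tracks_to_copy: int, pacedelay: int,
                         tracks_per_io: int = 15) -> float:
    """Estimated extra elapsed time (seconds) introduced by PACEDELAY=nn.

    pacedelay is in hundredths of a second, as on the SWAP statement.
    """
    writes = math.ceil(tracks_to_copy / tracks_per_io)
    gaps = max(writes - 1, 0)  # one delay between consecutive writes
    return gaps * pacedelay / 100


# Example: 15017 tracks to copy, PACEDELAY=5
print(added_pacing_seconds(15017, 5))
```

Under these assumptions, 15017 tracks need 1002 writes, so PACEDELAY=5 adds about 50 seconds to the first copy pass.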
Dynamic I/O Pacing is invoked by specifying PACING=DYNAMIC on the SWAP or SWAPDUMP statement. When in use, FDRPAS uses an algorithm to gauge the impact of the FDRPAS I/Os on queue lengths and I/O delays on the source volume. Every 15 seconds, FDRPAS may increase or decrease the PACEDELAY= value in use (from 0 to 50), depending on recent results. If you also specify the PACEDELAY= operand, it is used as the initial pacing value; otherwise the initial value is determined by FDRPAS when the swap starts (the maximum initial value is 20). You can observe the pacing value from the ISPF panels, and you can change it if desired (FDRPAS starts adjusting the pacing from the new value).
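The bounds described for dynamic pacing can be illustrated with a simple clamped controller. The actual FDRPAS algorithm is not documented here, so the adjustment rule, the step size, and the function names below are purely hypothetical; only the 0 to 50 range, the maximum initial value of 20, and the 15-second interval come from the text.

```python
from typing import Optional

MIN_DELAY = 0              # lower bound of the pacing value (from the text)
MAX_DELAY = 50             # upper bound of the pacing value (from the text)
MAX_INITIAL = 20           # maximum initial value chosen by FDRPAS (from the text)
ADJUST_INTERVAL_SECONDS = 15


def initial_delay(pacedelay: Optional[int], estimate: int) -> int:
    """Use PACEDELAY= if given; otherwise cap a hypothetical estimate at 20."""
    if pacedelay is not None:
        return pacedelay
    return max(MIN_DELAY, min(MAX_INITIAL, estimate))


def adjust(current: int, source_io_delayed: bool, step: int = 5) -> int:
    """Hypothetical rule: back off when other I/O is suffering, else speed up."""
    proposed = current + step if source_io_delayed else current - step
    return max(MIN_DELAY, min(MAX_DELAY, proposed))


delay = initial_delay(pacedelay=None, estimate=35)  # capped at MAX_INITIAL
for delayed in (True, True, False):                 # one sample per interval
    delay = adjust(delay, delayed)
print(delay)
```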
Terminating FDRPAS
FDRPAS SWAP tasks terminate automatically when all volumes requested by MOUNT statements have been processed (successfully or unsuccessfully).
FDRPAS MONITOR tasks with DYNMON=NO terminate automatically when all target devices being monitored (as specified on MOUNT statements or added dynamically) have been successfully swapped. A MONITOR task determines this by checking if the target devices are now online, so varying them online also terminates the MONITOR task. However, if the MONITOR task is monitoring a large number of target devices, it is unlikely that they are all swapped, so it may not terminate automatically. FDRPAS MONITOR tasks with DYNMON=YES only terminate automatically based on the DURATION= operand, if specified. If DURATION= is not specified, then they do not terminate automatically.
You can specify a DURATION=nn operand on a MONITOR statement. When the MONITOR task has accumulated “nn” minutes of idle time (during which it is not participating in the swap of any volume), it terminates automatically.
FDRPAS also supports the console STOP command (abbreviated P), specifying the job name or started task name of an FDRPAS SWAP or MONITOR task. For example:
P job
If you STOP(P) a MONITOR task, it terminates within a few seconds if it is idle. If it is participating in one or more active swaps, those swaps are allowed to complete, but new swap requests are not accepted.
If you STOP(P) a SWAP task, all active swaps are allowed to complete, but any requested volumes that have not yet started do not start. Messages are issued to identify the volumes that were bypassed because of the STOP(P).
If you must terminate active swaps for some reason, take these steps in this order, until the swaps are terminated.
- Issue a STOP(P) command to the SWAP task, which allows currently active volumes to complete. If you cannot wait for active swaps to finish, use the ISPF panels to ABORT the active swaps or issue a CANCEL(C) command. When all swaps have terminated, you can issue STOP(P) commands to the MONITOR tasks if they have not already terminated.
- If STOP(P) does not work, issue a console CANCEL(C) command to the SWAP task. When all swaps have terminated, you can issue STOP(P) commands to the MONITOR tasks if they have not already terminated.
- If the SWAP task does not terminate, then issue a CANCEL(C) command for each MONITOR task. Because of cancel protection (see "CANCELPROT="), you actually need to issue two CANCEL(C) commands for each task. However, the SWAP task, if still active, does not immediately know that the MONITOR tasks have terminated; it continues copying data until the end of the current copy pass, at which point each swap fails because of the missing MONITOR tasks.
A CANCEL(C) command causes the FDRPAS SWAP or MONITOR task to enter a cleanup routine for each active swap. It may take a minute or so to clean up all of the active subtasks, so FDRPAS may not terminate immediately. Since a second CANCEL(C) causes the cleanup to be bypassed, which may leave active volumes in an unknown state, FDRPAS rejects any more CANCEL(C) commands while it is in this cleanup until two minutes have passed. After two minutes, another CANCEL(C) is accepted, allowing you to terminate FDRPAS even when it is hung in the cleanup routine. If the CANCEL(C) commands do not work, you can use the console FORCE command to terminate the FDRPAS address space.
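The handling of repeated CANCEL(C) commands during cleanup behaves like a small state machine with a two-minute guard. The class below is a toy model of that description only: the class name, the return strings, and the use of plain seconds for time are illustrative assumptions, and the separate cancel protection controlled by CANCELPROT= is not modeled.

```python
from typing import Optional


class CancelGate:
    """Toy model of how CANCEL commands are treated during cleanup."""

    REJECT_WINDOW_SECONDS = 120  # "two minutes" from the text

    def __init__(self) -> None:
        self.cleanup_started_at: Optional[float] = None

    def on_cancel(self, now: float) -> str:
        """Return what happens to a CANCEL received at time `now` (seconds)."""
        if self.cleanup_started_at is None:
            self.cleanup_started_at = now
            return "start cleanup"
        if now - self.cleanup_started_at < self.REJECT_WINDOW_SECONDS:
            return "rejected"            # cleanup still protected
        return "bypass cleanup"          # accepted after two minutes


gate = CancelGate()
print(gate.on_cancel(0))     # first CANCEL starts the cleanup routine
print(gate.on_cancel(30))    # a CANCEL 30 seconds later is rejected
print(gate.on_cancel(125))   # after two minutes another CANCEL is accepted
```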
FDRPAS console status displays
You can display the status of the active volumes in an FDRPAS SWAP task on the console by issuing the console MODIFY(F) command like this:
F job,STATUS (or just STA)
FDRPAS responds with messages on the console and in the job log of the FDRPAS job or started task with the status of any volumes currently being swapped, similar to the information displayed by the FDRPAS ISPF interface. For example,
F job,STATUS
FDRW08 VOLSER UNIT TARG   % PASS TOCOPY COPIED UPDATE STATUS
FDRW08 ------ ---- ---- --- ---- ------ ------ ------ --------------
FDRW08 SH20CC 20CC 20CB  10    1  15017   1545      0 ACTIVE SWAP
FDRW08 SH20C6 20C6 20C5  19    1  15078   2865      0 ACTIVE SWAP
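If you capture this output from the job log, the FDRW08 lines are easy to turn into records for reporting. The parser below is a sketch that assumes the column layout seen in the sample above (whitespace-separated fields, with the status text last); the function name and the dictionary keys are hypothetical, and other FDRPAS releases may format the message differently.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, int]]


def parse_fdrw08(lines: List[str]) -> List[Record]:
    """Extract one record per volume from FDRW08 status lines."""
    records: List[Record] = []
    for line in lines:
        fields = line.split()
        if len(fields) < 10 or fields[0] != "FDRW08":
            continue                      # not a status line
        if fields[1] == "VOLSER" or set(fields[1]) == {"-"}:
            continue                      # column header or separator row
        records.append({
            "volser": fields[1],
            "unit": fields[2],
            "target": fields[3],
            "percent": int(fields[4]),
            "pass": int(fields[5]),
            "tocopy": int(fields[6]),
            "copied": int(fields[7]),
            "update": int(fields[8]),
            "status": " ".join(fields[9:]),  # status may contain blanks
        })
    return records


sample = [
    "FDRW08 VOLSER UNIT TARG   % PASS TOCOPY COPIED UPDATE STATUS",
    "FDRW08 ------ ---- ---- --- ---- ------ ------ ------ --------------",
    "FDRW08 SH20CC 20CC 20CB  10    1  15017   1545      0 ACTIVE SWAP",
    "FDRW08 SH20C6 20C6 20C5  19    1  15078   2865      0 ACTIVE SWAP",
]
for record in parse_fdrw08(sample):
    print(record["volser"], record["percent"], record["status"])
```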
Reset service class
You can reset the service class of an FDRPAS job or started task during execution by issuing a console MODIFY(F) command.