FDRPAS on multiple systems


Multi-system operation

When multiple z/OS system images can access the DASD volume to be moved, there are some additional steps, since the swap must be coordinated on all system images. All system images must be monitored for updates to the volume during the swap, and the final swap to the new device must be conducted simultaneously on all images. The sequence is as follows:

  • If using GENSWAP, you can use the Just-In-Time monitors to generate and submit the MONITOR tasks. Refer to FDRPAS GENSWAP for information on GENSWAP.
  • Ensure that an FDRPAS MONITOR task is running on every system that has access to the target device (even if that system does not have the source volume online), or add the PASJOB DD statement with the appropriate control statements to start the MONITOR tasks with the SWAP job. Each MONITOR task can be directed to monitor a single target device or a range of potential target devices, or to have target devices added dynamically. A DASD device can connect to up to 128 systems, so FDRPAS supports up to 127 MONITOR tasks for a given SWAP.
  • Start the FDRPAS SWAP task on any system, specifying the volume to be swapped and the output (target) device. For best performance, the SWAP task should run on the system with the highest level of update activity on the volume to be swapped.

Important

If you are running with a MAXTASKS= value greater than 64 and have verified that you have sufficient PVT storage (see Determining Available PVT Storage), contact BMC Support before running the SWAP task to discuss the region size requirements of all systems that have access to the source and target devices and the best system on which to run the SWAP task.

  • After validating the swap request, the FDRPAS SWAP task indicates that the swap is pending.
  • On the other system images, the FDRPAS MONITOR tasks recognize that the swap is pending and indicate that they are ready to participate in the swap. If the MONITOR task is monitoring only a single target device, that task handles the entire swap process. If the MONITOR task is monitoring multiple target devices, the MONITOR task starts a separate FDRPAS task for each volume when the swap begins.
  • When the required number of MONITOR tasks have acknowledged their participation, the SWAP task signals that the swap has begun. The SWAP task installs the I/O intercept on its image to monitor updates.
  • The MONITOR tasks recognize that the swap has begun and install the I/O intercept on their images to monitor updates.
  • When all MONITOR tasks have indicated that the intercepts are installed, the SWAP task begins copying tracks from the original device to the target device.
  • The FDRPAS intercepts on each system monitor all I/O operations to the original device and note all of the tracks that have been updated. Updated tracks are copied (or re-copied, if they were previously copied) to the new device.
  • When the copy is complete or the number of tracks remaining to be copied is below a threshold, FDRPAS signals all MONITOR tasks to quiesce all I/O to the original device. The remaining tracks, if any, are copied while all other I/O is quiescent. At this point, the target device is an exact copy of the source volume.
  • The SWAP task now signals all MONITOR tasks to swap all system pointers on all system images so that all future I/O to the volume is directed to the new device. The original device is placed offline, and the volume label on that device is modified so that it cannot be accidentally placed online.
  • I/O to the new device is re-enabled, all I/O intercepts are removed, and the SWAP task terminates.
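
The copy, re-copy, and quiesce steps above can be illustrated with a small simulation. This is a minimal sketch in Python, not FDRPAS code: the function name, its parameters, and the idea of feeding in one set of updated tracks per copy pass are all assumptions made for illustration. It also simplifies by treating every update as arriving after that pass's copies, so each updated track is always re-copied.

```python
def simulate_copy_phase(total_tracks, updates_per_pass, quiesce_threshold):
    """Toy model of the copy phase; not FDRPAS code.

    updates_per_pass[i] is the set of track numbers that the intercepts
    reported as updated while copy pass i was running (hypothetical input).
    Returns (passes, tracks_copied, target_matches_source).
    """
    source = {t: 0 for t in range(total_tracks)}  # track -> version on source
    target = {}                                   # track -> version on target
    pending = set(source)                         # tracks still to be copied
    updates = iter(updates_per_pass)
    passes = 0
    copied = 0

    # Keep copying while work remains and the backlog is not below the threshold.
    while pending and len(pending) >= quiesce_threshold:
        for t in pending:
            target[t] = source[t]
        copied += len(pending)
        pending = set()
        # Simplification: updates in this pass land after the pass's copies,
        # so every updated track must be copied again.
        for t in next(updates, set()):
            source[t] += 1
            pending.add(t)
        passes += 1

    # I/O is quiesced here: no new updates arrive while the rest is copied.
    for t in pending:
        target[t] = source[t]
    copied += len(pending)
    return passes, copied, target == source


print(simulate_copy_phase(10, [{1, 2, 3, 4}, {2, 3}, {3}], 2))  # (3, 17, True)
```

In this run, three passes copy 10, 4, and 2 tracks; the single remaining track is copied after the quiesce, at which point the modelled target matches the source.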

System Determination

In a multi-system environment, one or more FDRPAS MONITOR tasks must be executed on every system image that has the source volume online; one of those MONITOR tasks must monitor the target device if it is in the I/O configuration of that system. If some systems are excluded, those systems are not aware that FDRPAS has moved the volume to a new device, and FDRPAS is not aware of updates to the volume that occur on the excluded systems during the swap. This could have serious consequences, including data corruption and data loss.

If you have systems in your complex that have the source volume online but do not have access to the target device, you must not attempt to swap the volume to that device.

FDRPAS attempts to determine how many systems have access to the source volume, in order to protect you against potentially disastrous errors in setting up the FDRPAS swaps. Depending on the DASD hardware involved, FDRPAS may be able to identify the number of systems accessing the source volume and the CPU serial number of each system. However, if the number of systems cannot be determined, or if you need to exclude certain systems from participating in the swap of a given volume, you need to provide input to FDRPAS. Here are the steps that FDRPAS takes:

  • If the source DASD supports Query Host Access (QHA), FDRPAS is able to determine how many system images have access to the source volume.
  • Once the SWAP task signals that the swap is beginning, the MONITOR tasks on each system register their participation. The SWAP task verifies that the proper number of systems is participating. If the CPU serial numbers of the systems are known, it verifies the serial number of each MONITOR task against the list of expected serials.
  • If the expected number of systems (or CPU serials) does not participate, then FDRPAS issues message FDRW68 indicating this condition. If “NO” is replied, the swap is terminated (if you specify NONRESPONDING=FAIL, then a reply of “NO” is assumed and no FDRW68 message is issued). You may also reply “RETRY”, which causes FDRPAS to wait some additional time to see if the expected number of systems eventually participates. The FDRW68 message can be issued as a WTOR to the system operator or you can display and reply to the message from the FDRPAS ISPF panels. You should try replying “RETRY” at least once, in case some MONITOR tasks were delayed.
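
The verification steps above can be modelled roughly as follows. This is a hedged Python sketch of the decision logic only, not the way FDRPAS implements it: the names Action, missing_participants, and next_action are hypothetical, and only the “NO” and “RETRY” replies described above are modelled.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed with swap"
    TERMINATE = "terminate swap"
    WAIT_AGAIN = "wait for late MONITOR tasks"


def missing_participants(registered, expected_count, expected_serials=None):
    """Return how many expected systems have not registered."""
    registered = set(registered)
    if expected_serials is not None:
        # CPU serials are known: compare against the expected list.
        return len(set(expected_serials) - registered)
    # Only the count is known: compare numbers.
    return max(0, expected_count - len(registered))


def next_action(registered, expected_count, expected_serials=None,
                nonresponding_fail=False, reply=None):
    """Decide what happens after the MONITOR tasks have registered."""
    if missing_participants(registered, expected_count, expected_serials) == 0:
        return Action.PROCEED
    if nonresponding_fail:
        # NONRESPONDING=FAIL: a reply of "NO" is assumed, no FDRW68 is issued.
        return Action.TERMINATE
    if reply == "NO":
        return Action.TERMINATE
    if reply == "RETRY":
        return Action.WAIT_AGAIN
    raise ValueError("reply not modelled in this sketch: %r" % (reply,))
```

For example, next_action(["A"], 2, reply="RETRY") yields Action.WAIT_AGAIN, which corresponds to FDRPAS waiting some additional time for the missing system.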

In the most common configuration, where the source volume and the target device are in the I/O configuration of every system in your complex, you simply need to start a MONITOR task for the output device on every system, and the rest is automatic. If FDRPAS identifies systems that did not register, then the MONITOR task is not executing on those systems; just fix that error and try again.

The process is more complex when the source volume and/or the target device are not in the I/O configuration of some of your systems, or the source volume is offline on some systems, but even then, FDRPAS attempts to automate the process:

  • If the source volume is not in the configuration or is offline on some systems, but the target device is in the configuration, you should execute a MONITOR task on those systems. The MONITOR task sees the swap request, determines that it does not need to participate in the swap because the source volume is not in use, and communicates that to the SWAP task. The SWAP task counts this as a responding system but excludes it from swap processing.
  • If the target device is not in the configuration of some systems, but those systems are connected to the system executing the SWAP task via GRS (a GRS complex), then you should execute a MONITOR task with DYNMON=YES on those systems. FDRPAS uses a series of cross-CPU enqueues (major names FDRPAS and FDRPASQ) to communicate that those systems do not need to participate.

Warning

If some systems have the source volume online but do not have access to the target device, do not attempt to swap that volume unless you vary the source volume offline on those systems first. It will not be accessible on those systems after the swap.

Only in the situation where some systems have the source volume offline but do not have access to the target device and are not connected to the swapping system by GRS or MIM, do you need to take special actions to allow FDRPAS to continue. If this is the situation, contact BMC Support for further information on how to proceed.
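
The cases described in this section can be summarized as a per-system decision helper. The following Python sketch is illustrative only: the function monitor_plan, its three flags, and the returned strings are hypothetical, and it assumes that a source volume that is not in a system's configuration is treated the same as one that is offline.

```python
RUN_MONITOR = "run a MONITOR task for the target device"
RUN_MONITOR_EXCLUDED = "run a MONITOR task; it responds but is excluded from the swap"
RUN_MONITOR_DYNMON = "run a MONITOR task with DYNMON=YES"
VARY_OFFLINE_FIRST = "do not swap unless the source volume is varied offline here first"
CONTACT_SUPPORT = "contact BMC Support"


def monitor_plan(source_online, target_configured, enq_connected):
    """Suggest the setup for one system image.

    source_online     -- the source volume is online on this system
    target_configured -- the target device is in this system's I/O configuration
    enq_connected     -- the system is connected to the swapping system
                         by GRS or MIM
    """
    if target_configured:
        if source_online:
            return RUN_MONITOR
        return RUN_MONITOR_EXCLUDED
    if source_online:
        # The volume would not be accessible here after the swap.
        return VARY_OFFLINE_FIRST
    if enq_connected:
        return RUN_MONITOR_DYNMON
    return CONTACT_SUPPORT
```

For example, monitor_plan(False, False, True) returns the DYNMON=YES recommendation, while monitor_plan(False, False, False) is the one case that calls for special action.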

If you get the FDRW68 message indicating that there are non-responding systems, you should reply “RETRY” at least once to be sure that a slow system was not prevented from replying. If the FDRW68 is reissued, then you should reply “NO” to terminate the swap, investigate the cause, and update the FDRPAS input statements or start the proper FDRPAS MONITOR tasks to correct the error.

In many installations, all devices in all DASD subsystems are defined to all systems in the complex, so executing FDRPAS is simply a matter of making sure that the proper FDRPAS MONITOR tasks are running on every system.

In some installations, such as service bureaus and outsourcing sites, certain devices in DASD subsystems may be deliberately omitted from the I/O configuration on some systems, to prevent inadvertent access. In these installations, more care must be taken to be sure that the requirements for FDRPAS are met.


 
