FDRINSTANT for SnapShot Consistent Backup Support

FDRINSTANT supports Consistent Backup of DASD by using Hitachi Vantara Consistent Split of ShadowImage volumes. Consistent Split is also referred to in some Hitachi Vantara documentation as “At-Time Split”.

Consistent Split allows you to split a number of ShadowImage or TrueCopy volumes at the same point-in-time, in other words, at the same point of I/O consistency. I/O consistency means that all the backups of the split volumes are at the same point in I/O processing, and there is no possibility that some of the volumes contain the results of I/Os issued after that point.

I/O consistency is particularly important to applications that execute dependent write I/Os. Dependent writes are issued by the application in a particular order, and the next write in the sequence is not done until the previous write has successfully completed. Many applications, and in particular database management systems (DBMSs such as DB2) use dependent writes to ensure data integrity in the event of a hardware or software failure.

To execute a database update, the DBMS may write information about the update to the log data set, then it will do the actual database update (one or more I/Os), then it will update the log to indicate that the database update was successfully done (“committed”). If the DBMS is restarted after a failure, it can tell which database updates were completed and which were not (“in-flight”) and do the proper recovery.

During a normal backup, if the DBMS is still active, the volumes may not be at a consistent point-in-time. For normal FDRABR backups, the volumes may be processed many minutes apart. Even with FDRINSTANT backups, the splits of the volumes may be separated by seconds. If these backups must be restored, the DBMS may not be able to restart the databases because the logs and databases are not consistent. The usual means of addressing this problem is to suspend updates to the database during the backup process (in some releases of DB2, this is done with the LOG SUSPEND command).
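To make this exposure concrete, the following is a minimal Python sketch. It is not part of FDRINSTANT; the volume names, the write sequence, and the function names are all hypothetical. It models a DBMS issuing dependent writes to a log volume and a database volume. When every volume is split after the same number of writes, the copy is restartable. When the volumes are split at different moments, the log copy can hold a commit record whose database update is missing from the database copy.

```python
# Hypothetical dependent-write sequence: each write is issued only after
# the previous one has completed.
WRITES = [
    ("LOGVOL", "begin update 1"),
    ("DBVOL", "update 1 data"),
    ("LOGVOL", "commit update 1"),
    ("LOGVOL", "begin update 2"),
    ("DBVOL", "update 2 data"),
    ("LOGVOL", "commit update 2"),
]


def split_image(split_points):
    """Return the records each volume's split copy would contain.

    split_points maps a volume name to the number of writes (counted over
    the whole sequence) that had completed when that volume was split.
    """
    image = {vol: [] for vol in split_points}
    for seq, (vol, record) in enumerate(WRITES):
        if seq < split_points[vol]:
            image[vol].append(record)
    return image


def is_restartable(image):
    """Every committed update must have its data on the database volume."""
    for record in image["LOGVOL"]:
        if record.startswith("commit "):
            update = record[len("commit "):]  # for example "update 1"
            if f"{update} data" not in image["DBVOL"]:
                return False
    return True


# Consistent split: both volumes are split after the same 4 writes.
consistent = split_image({"LOGVOL": 4, "DBVOL": 4})
# Skewed split: DBVOL is split after 1 write, LOGVOL after 3 writes.
skewed = split_image({"LOGVOL": 3, "DBVOL": 1})

print(is_restartable(consistent))
print(is_restartable(skewed))
```

In the consistent case, update 2 is merely in-flight (its begin record is logged but not its commit), which a DBMS can recover from at restart. In the skewed case, the log claims update 1 was committed although its data never reached the database copy.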

FDRABR support for Hitachi Vantara Consistent Split prevents dependent I/O from being issued by the application during the split process, thus ensuring the integrity and consistency of the data on the ShadowImage volumes. This results in a copy of the data that is restartable after a restore, even though the DBMS may have been updating the databases during the split. There is no need to suspend updates to the database during the Consistent Split operation.

When to use Consistent Backup

Consistent Backup is most appropriate when the volumes involved contain only the data sets that are used by a single DBMS or database application. Be sure to include volumes containing the log files used by that DBMS or application.

For volumes containing other types of data sets, Consistent Backup may or may not be appropriate. If the data sets or applications that use them are tolerant of system failures and provide appropriate recovery when restarted, and related data sets are spread across multiple volumes, then you may be able to use Consistent Backup on those volumes without quiescing the application. However, this recovery should be tested to be sure that it works properly.

Although Consistent Backup ensures I/O consistency between volumes, it does not ensure that a sequence of I/Os issued to a given volume is complete. For example, if a VSAM file is undergoing a CA split, which requires multiple I/Os to the cluster and several I/Os to the VVDS, the split may still capture an image of the volume at a point in the middle of those I/Os, which may make the cluster unusable when restored.

Most data sets and most applications do not provide the dependent writes and recovery provided by database systems, so Consistent Backup does not ensure that data sets, VTOCs, and VVDSs are error-free if you must restore a backup created after a consistent split (or even a normal split). If a split occurs in the middle of a sequence of update I/Os, data sets may contain improper formatting or I/O errors, and the VTOC or VVDS may have partially completed updates that will cause errors or failures. Of course, these sorts of problems may also occur if you do a normal backup of an online volume while it is being updated; Consistent Backup does not improve this exposure.

Unless you take all possible steps to quiesce update activity on a volume before it is split (with regular or consistent split), the possibility exists that the data sets or volume may not be usable when restored, unless the data or application is designed to tolerate such partially completed I/O (as described above for DBMS).

You should not use Consistent Backup on volumes containing system data, such as:

Catalogs
Tape management data
Security data
JES2/JES3 spool or checkpoint data
Coupling Facility data sets
Paging data sets
Data used by other products that communicate between systems such as cross-system enqueue products

Using FDRINSTANT Consistent Backup on such volumes may result in interlocks and system failures. FDRINSTANT holds a hardware RESERVE on all volumes involved in the split, but it may need to invoke system services or cause paging that will require normal access to such data sets. So if they are involved in the Consistent Backup, the results can be serious or fatal to your system.

ShadowImage Consistent Split works only on volumes in a single Hitachi subsystem, so you must ensure that all volumes that must be split at the same consistent point-in-time are mounted in the same subsystem.

Defining Consistency Groups

Before you can use ShadowImage Consistent Split, you must reserve one or more Consistency Groups in each Hitachi subsystem involved and you must assign volumes to the consistency groups.

Consistency Groups are reserved using the Hitachi Storage Navigator. Storage Navigator is an Internet browser application that communicates with the code running in your Hitachi subsystem. Please consult the appropriate Hitachi Storage Navigator manual for instructions for setting up and running Storage Navigator. Please consult the appropriate Hitachi ShadowImage Guide for instructions on defining Consistency Groups using Storage Navigator.

Consistency Groups (CTGs) have an ID from 000-127 (00-7F in hex) within each Hitachi subsystem. Although it is possible to reuse the same Consistency Group numbers in different subsystems, we recommend that you keep them unique across your entire system, to avoid confusion.

In the Storage Navigator CTG screen, a consistency group will have a status of “Free” if it is not in use, “Reserved” if it is in use but no volumes have been assigned, and “Used” if volumes have been assigned to the group. To create a new group, select a group that is currently “Free” and perform the “Add CTG” function to change it to “Reserved”.

To add volumes to a Consistency Group, you must use the TSO CESTPAIR or ICKDSF ESTPAIR commands with special values in the PRI operand. Replace the “serial number” (the second sub-operand of PRI) with “MAxx0” where “xx” is the CTG number in hex (00-7F). For example, to assign a ShadowImage pair to Consistency Group 03, use:

CESTPAIR DEVN(X'0C84') PRI(X'0080',MA030,X'04',X'01') -
         SEC(X'0080',30158,X'1A',X'01')

or:

PPRCOPY ESTPAIR DDNAME(DISK1) PRI(X'0080',MA030,X'04') -
        SEC(X'0080',30158,X'1A') LSS(X'01',X'01')

(Remember that the LSS parameters, X'01' in this example, are used only if your subsystem is in 2105 emulation mode.)

As you execute these commands, the Hitachi subsystem will assign the volume pairs to the Consistency Group. Once every required volume has been added to the group and the pairs are in Duplex status (fully mirrored), you are ready to do Consistent Backups on that group.
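The "MAxx0" value used in the PRI operand can be derived mechanically from the Consistency Group number. The following Python sketch illustrates the encoding described above; the function name is hypothetical, and the use of uppercase hex digits is an assumption based on the MA030 example.

```python
def ctg_pri_serial(ctg: int) -> str:
    """Build the MAxx0 value that replaces the serial number in PRI.

    ctg is the Consistency Group number in decimal (0-127).
    Uppercase hex digits are an assumption, not a documented requirement.
    """
    if not 0 <= ctg <= 127:
        raise ValueError("CTG number must be in the range 0-127 (hex 00-7F)")
    return f"MA{ctg:02X}0"


print(ctg_pri_serial(3))    # MA030, matching the CESTPAIR example above
print(ctg_pri_serial(127))  # MA7F0
```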

Restrictions and Considerations

You must define a separate Consistency Group number for each set of volumes that you intend to Consistently Split (split at a consistent point-in-time).
All volumes involved in a ShadowImage Consistent Split must be in the same Hitachi DASD subsystem.
FDRABR splits the entire Consistency Group even if the ABR MOUNT statements in the CONPSPLIT step do not include all volumes in the group. If you remove a volume from an FDRABR Consistent Split, you must also delete the volume from the associated Consistency Group (by using the TSO CDELPAIR or ICKDSF DELPAIR command). If you accidentally omit a volume in the group from the MOUNT statements, the volume is split but it is neither backed up nor re-synced after the backup.
CONPSPLIT issues the Consistent Split request to one volume among the volumes specified on the MOUNT statements, giving the CTG number you specify. If that volume is not in that group, the split fails. If it is in the group, all volumes in the group are Consistently Split at the same time. However, ABR cannot verify that all the volumes on your MOUNT statements are in the same group. It is your responsibility to ensure that the volumes in the group and the volumes on the MOUNT statements match.
You cannot use Quick Split in a CONPSPLIT step. All the volumes involved in the Consistent Split must have been previously established as ShadowImage pairs as shown above.
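Because ABR cannot verify that the MOUNT statements match the Consistency Group, you may want to compare the two lists yourself before running the job. The following Python sketch shows one way to do that comparison. It assumes you have already collected the group membership from Storage Navigator or from your own records; the function name and the volume serials are hypothetical.

```python
def compare_mounts_to_group(mount_volumes, group_volumes):
    """Report mismatches between MOUNT volumes and Consistency Group volumes."""
    mounts = set(mount_volumes)
    group = set(group_volumes)
    return {
        # In the group but not on a MOUNT: split, but not backed up or re-synced.
        "split_but_not_backed_up": sorted(group - mounts),
        # On a MOUNT but not in the group: not part of the consistent split.
        "not_in_group": sorted(mounts - group),
    }


# Hypothetical lists: DB2A04 was left off the MOUNT statements.
report = compare_mounts_to_group(
    mount_volumes=["DB2A01", "DB2A02", "DB2A03"],
    group_volumes=["DB2A01", "DB2A02", "DB2A03", "DB2A04"],
)
print(report)
```

Both lists in the report should be empty before you submit the CONPSPLIT step.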

Using Consistent Backup with ABR

In order to use Consistent Backup with FDRABR, you use essentially the same FDRABR procedures that are documented for normal PSPLIT in ShadowImage-for-ABR. However, the main statement in the split step will specify:

CONPSPLIT

Use this instead of PSPLIT as the operation name.

CTG=nnn

This parameter identifies the Consistency Group number that contains the volumes to be split in this step.

Only one Consistency Group can be processed per job step invocation of the CONPSPLIT function. Do not mix non-Consistent Backup requests with Consistent Backup requests in the same step.

The MOUNT statements in the CONPSPLIT step must identify all of the volumes that must be split at the same consistent point-in-time, and they must all be in the same Hitachi DASD subsystem. Volumes that are part of the Consistency Group but are not identified on a MOUNT statement are still split, but they are not processed by ABR. We recommend that you provide one MOUNT per volume involved, including the PPRCUNIT= operand on each MOUNT so that ABR knows the UCB device number of the ShadowImage volume associated with each online volume; this will speed up the CONPSPLIT processing.

However, it is possible to omit the PPRCUNIT= operand, which allows you to specify multiple volumes (for example, VOLG=DB2) or SMS storage groups (for example, STORGRP=DBASE1) on the MOUNT statements. Be aware that this may cause the CONPSPLIT to take longer while ABR determines the UCB device number of each ShadowImage volume, especially the first time that CONPSPLIT is run after an IPL (after the first run, CONPSPLIT will store the addresses for later use).

The CONPSPLIT step operates in two phases:

SPLIT phase – By default, FDRABR first obtains a SYSVTOC RESERVE on each volume specified; this is to inhibit I/O to the volume from other systems and to protect against partially completed VTOC and VTOCIX updates from all systems. However, if SYSVTOC RESERVEs are being suppressed (for example, by the GRS Reserve Conversion RNL), FDRABR does a SYSVTOC enqueue instead. Finally, it issues the Consistent Split request to one volume in the group (which causes all volumes to be split at once) and allows I/O to resume. However, the RESERVE or enqueue is still held.
ABR phase – At this point, ABR begins its normal processing of each volume, identifying the data sets on each volume that are to be backed up (depending on whether this is a full-volume or incremental backup) and marking them in the VTOC of the online volume. As it completes each volume, it releases the SYSVTOC RESERVE or enqueue. This usually takes only a few seconds per volume, although it may take longer on volumes with especially large VTOCs or VVDSs or a large number of data sets. ABR uses an internal subtask for each volume, and CONPSPLIT has been enhanced to use up to 32 subtasks to do this phase of processing on up to 32 volumes at a time.

The CONPSPLIT step must be followed by a regular ABR DUMP step, specifying PPRC= on the DUMP statement, as shown in the CONSPLIT Example and in ShadowImage for ABR.

Additional CONSPLIT operands

There are some additional operands that can optionally be specified in a CONPSPLIT step. None of them are required and you rarely need to use them unless instructed to do so by BMC Support.

These operands can be added to the CONPSPLIT statement:

MAXTASKS=

Specifies the number of volumes that are processed concurrently during the ABR processing phase of CONPSPLIT. This does not affect the total number of volumes that may be involved in the Consistent Split; all volumes specified on the MOUNT statements in the CONPSPLIT step are split at the same time.

Default: 32 (which is also the maximum).

RESVTIMEOUT=

This specifies the time (in minutes) that FDRINSTANT for FDRABR waits to acquire the SYSVTOC reserve on each volume involved in the Consistent Split. If it cannot acquire the reserve within this time, it releases all reserves that it has already acquired on other volumes and starts again to acquire all the reserves. This is to avoid interlocks with other systems because of the hardware reserves.

Default: 1 minute.

RESVTIMEOUT#=

This is the maximum number of times that FDRINSTANT for FDRABR releases the reserves and tries to acquire them again (see “RESVTIMEOUT=”). If it cannot acquire the reserve on every volume in the Consistent Split (except those for which RESENQ=NONE was specified) within this number of attempts, it fails the CONPSPLIT operation.

Default: 10.
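The interaction between RESVTIMEOUT= and RESVTIMEOUT#= follows an all-or-nothing pattern: if any one reserve cannot be obtained, every reserve already held is released and the whole set is attempted again. The following Python sketch illustrates that pattern only; it is not the FDRINSTANT implementation. The function names and the try_reserve and release callbacks are hypothetical, and treating RESVTIMEOUT#= as the total number of attempts is an assumption.

```python
def acquire_all_reserves(volumes, try_reserve, release, max_attempts=10):
    """Return True once every volume is reserved, False after max_attempts.

    try_reserve(vol) stands in for waiting up to RESVTIMEOUT for a reserve
    and returns False on timeout. max_attempts stands in for RESVTIMEOUT#
    (assumed here to count total attempts).
    """
    for _attempt in range(max_attempts):
        held = []
        for vol in volumes:
            if try_reserve(vol):
                held.append(vol)
            else:
                # Timed out: release everything so other systems can proceed,
                # then start over. This is what avoids interlocks.
                for held_vol in held:
                    release(held_vol)
                break
        else:
            return True  # no timeout occurred; every reserve is held
    return False


# Hypothetical demonstration: DB2A02 times out on its first two tries.
busy = {"DB2A02": 2}
held_now = set()


def try_reserve(vol):
    if busy.get(vol, 0) > 0:
        busy[vol] -= 1
        return False
    held_now.add(vol)
    return True


def release(vol):
    held_now.discard(vol)


ok = acquire_all_reserves(["DB2A01", "DB2A02", "DB2A03"], try_reserve, release)
print(ok, sorted(held_now))
```

Volumes specified with RESENQ=NONE would simply be left out of the list passed to such a routine.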

Additional MOUNT operands

This operand can be added to any MOUNT statement in the CONPSPLIT step:

RESENQ=NONE

This suppresses the SYSVTOC reserve and enqueue for the volumes on that MOUNT. However, if the reserve or enqueue is suppressed, then FDRINSTANT for FDRABR cannot protect against partially completed VTOC updates done by other systems, which may result in corrupted VTOCs when the backup is restored. RESENQ=NONE is intended for use for volumes where there is little or no update activity to the VTOC and other data sets, when a reserve could result in errors or interlocks. Do not use this operand unless you understand its implications; contact BMC Support for guidance.

CONSPLIT example

Here is an example similar to those in ShadowImage-for-ABR, modified to show CONPSPLIT. This shows a full-volume ABR backup, but it can also be used with ABR incremental backups (TYPE=ABR, AUTO, or DSF). It uses Consistency Group 02, which must have been previously defined in the Hitachi subsystem.

Step FULL1 consistently splits the specified ShadowImage volumes from their online volumes to create point-in-time images and mark the backups as complete. It creates a new ABR backup generation and updates the online volumes with information about the new backups. Although all the volumes specified by the MOUNT statements are split in one consistent split operation in the split phase, the ABR processing phase processes up to 32 volumes concurrently, in order to reduce the elapsed time of the CONPSPLIT step. The elapsed time of the step depends on the size of the VTOC and VVDS and the number of data sets on each volume. As soon as this step completes, the point-in-time backup is complete. ENQERR=NO is specified because the data sets to be dumped (for example, database files) are likely to be in use at the time of the CONPSPLIT.

//FULL1    EXEC PGM=FDRABR,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSPRIN1 DD SYSOUT=*
//FDRSUMM  DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//TAPE1    DD DUMMY,RETPD=60
//SYSIN    DD *
 CONPSPLIT TYPE=FDR,ENQERR=NO,RTC=YES,CTG=02
 MOUNT VOL=DB2A01,PPRCUNIT=07C0
 MOUNT VOL=DB2A02,PPRCUNIT=07C1
 MOUNT VOL=DB2A03,PPRCUNIT=07C2
 MOUNT VOL=DB2A04,PPRCUNIT=07C3
 … … …

Step FULL2 does the actual full-volume backups. PPRC=(USE,RET) tells ABR to determine if a split point-in-time image exists for each volume processed; if so, that image is backed up instead of the online volume. ABR remembers the addresses of the ShadowImage volumes split in step FULL1. Each ShadowImage is RE-ESTABLISHed as soon as its backup is complete.

//FULL2    EXEC PGM=FDRABR,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSPRIN1 DD SYSOUT=*
//FDRSUMM  DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//TAPE1    DD DSN=ABR1,UNIT=CART,DISP=(NEW,KEEP),RETPD=60
//TAPE11   DD DSN=ABR11,UNIT=CART,DISP=(NEW,KEEP),EXPDT=99000
//SYSIN    DD *
 DUMP TYPE=FDR,ENQERR=NO,RTC=YES,PPRC=(USE,RET)
 MOUNT VOL=DB2A01
 MOUNT VOL=DB2A02
 MOUNT VOL=DB2A03
 MOUNT VOL=DB2A04
 …