Key concepts of the recovery process
The recovery process begins after a problem is discovered.
IMS or the database validation program may find a pointer error. A hardware malfunction might physically damage a data set or disable a disk pack filled with IMS database data sets. Someone might discover a logic error in an IMS application program that has improperly updated several databases.
Whatever the cause, the database must be recovered to the state it was in before the failure occurred. Because the database must be taken offline for recovery, it is unavailable to IMS users until the recovery completes, and recovery situations commonly involve the loss of multiple database data sets. This unavailability can critically affect business operations and cost the company money.
The following types of input data can be used during the recovery process:
- During a full recovery, a copy of the database as it existed prior to the failure is used as a starting point. As a standard procedure, most installations take regular image copies of their databases for recovery purposes. A full recovery also uses records of all changes that have been made to the database since the copy was made. IMS writes these records to its log data sets. Many installations further process their logs with a change accumulation utility.
- For a roll-forward recovery, a copy of the database data sets is used as a starting point, and logged changes are applied to bring the database up to the current time.
- For a backout recovery, the existing database data sets, containing the changes that need to be backed out, are used as the starting point. Logged changes are applied to return the database to the condition it was in at a previous point in time.
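The difference between roll-forward and backout recovery can be illustrated with a minimal sketch. It is a toy model only: the `Change` record, the dictionary standing in for a database data set, and the function names are hypothetical and do not reflect the IMS log record format or any utility interface. It assumes each logged change carries both a before image and an after image.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


# Hypothetical, simplified change record; real IMS log records are far richer.
@dataclass(frozen=True)
class Change:
    time: int               # logical timestamp of the update
    key: str                # identifier of the updated record
    before: Optional[str]   # value before the update (None = did not exist)
    after: Optional[str]    # value after the update (None = deleted)


def _put(db: Dict[str, str], key: str, value: Optional[str]) -> None:
    """Store a value, or delete the key when the value is None."""
    if value is None:
        db.pop(key, None)
    else:
        db[key] = value


def roll_forward(image_copy: Dict[str, str], log: List[Change]) -> Dict[str, str]:
    """Start from a copy and apply after images in time order."""
    db = dict(image_copy)
    for change in sorted(log, key=lambda c: c.time):
        _put(db, change.key, change.after)
    return db


def back_out(current: Dict[str, str], log: List[Change], to_time: int) -> Dict[str, str]:
    """Start from the existing data and undo changes newer than to_time."""
    db = dict(current)
    for change in sorted(log, key=lambda c: c.time, reverse=True):
        if change.time > to_time:
            _put(db, change.key, change.before)
    return db


image_copy = {"A": "1", "B": "2"}
log = [
    Change(1, "A", "1", "10"),   # update A
    Change(2, "C", None, "3"),   # insert C
    Change(3, "B", "2", None),   # delete B
]
current = roll_forward(image_copy, log)
earlier = back_out(current, log, to_time=1)
```

In this sketch, `roll_forward` needs only the copy and the after images, while `back_out` needs the damaged-but-present data and the before images, which mirrors the different starting points described above.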
Many installations also use the IMS Database Recovery Control (DBRC) feature to keep track of database data sets, image copies, and log and change accumulation data sets. The Recovery utility dynamically allocates the data sets that are involved in the recovery, helping reduce the time required to set up the recovery jobs and ensuring that all the correct data sets are included in the job.
As part of the recovery process, most installations also take an image copy of the recovered database and verify its pointers to ensure the recovered database is valid.
Problems associated with conventional recovery methods
Several problems are currently associated with conventional recovery methods:
- Conventional methods recover one database data set per job step. Although the GENJCL function of DBRC can generate the JCL to recover multiple database data sets, it generates the JCL as a set of job steps that must be run serially. When the elapsed time of this serial recovery is too great, users must create JCL manually so that the recoveries can run concurrently. They usually also have to run an additional job to copy the change accumulation data set from tape to disk so that the multiple steps can access it concurrently.
- Recovery can be performed only to a valid recovery point (time). Creating periodic recovery points can be disruptive in continuous-availability environments.
- Each recovery job step must read the change accumulation and/or log data set(s). The same data is processed multiple times to recover multiple data sets.
- The utilities that validate pointers and take image copies must be run in a separate job step after the recovery job step(s) complete.
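The cost of reading the same log data once per data set can be made concrete with a small sketch. It is an illustration under simplifying assumptions, not a model of any actual utility: the tuple layout of `LogRecord`, the dictionaries standing in for data sets, and the function name `recover_serially` are all hypothetical.

```python
from typing import Dict, List, Tuple

# (data_set_name, key, new_value): hypothetical stand-in for a logged change
LogRecord = Tuple[str, str, str]


def recover_serially(
    image_copies: Dict[str, Dict[str, str]],
    log: List[LogRecord],
) -> Tuple[Dict[str, Dict[str, str]], int]:
    """Recover each data set in its own step; every step rereads the whole log."""
    recovered: Dict[str, Dict[str, str]] = {}
    records_read = 0
    for name, copy in image_copies.items():   # one "job step" per data set
        db = dict(copy)
        for data_set, key, value in log:       # full pass over the log
            records_read += 1
            if data_set == name:               # most records are skipped
                db[key] = value
        recovered[name] = db
    return recovered, records_read


image_copies = {"DS1": {"a": "0"}, "DS2": {"b": "0"}, "DS3": {}}
log = [
    ("DS1", "a", "1"),
    ("DS2", "b", "2"),
    ("DS3", "c", "3"),
    ("DS1", "a", "4"),
]
recovered, records_read = recover_serially(image_copies, log)
```

With three data sets and four log records, this approach reads 12 records to apply 4 changes. The number of reads grows with the product of data sets and log volume.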
Recovery utility techniques
The Recovery utility addresses these problems and significantly reduces the time and resources needed to recover a database:
- The Recovery utility recovers multiple database data sets with a single pass of the change accumulation data set and system (recovery) log data sets.
- The Recovery utility can recover a database to any point in time that you specify. It is not limited to a valid recovery point as defined by DBRC. It can use a point-in-time change accumulation that is produced by the BMC Change Accumulation utility.
- Recovery tasks can be restarted automatically after failure.
- The Recovery utility can create up to 10 image copies during the recovery process with little impact on performance. The additional copies can be registered in the Recovery Manager (RMGR) repository through the Recovery Extensions feature.
- The Recovery utility can automatically handle an Instant Snapshot copy that was created by the BMC Image Copy utility. The SNAPSHOT UPGRADE FEATURE (SUF) component works with intelligent storage devices to restore an Instant Snapshot copy nearly instantaneously.
- The Recovery utility can restore a T0 copy that was created by the Database Image Copy 2 utility from IBM (program DFSUDMT0, also referred to as the T0 Copy utility).
- Depending on the product or products you have installed, the Recovery utility can invoke a function to rebuild the primary index and all secondary indexes for a recovered DL/I database.
- Depending on the product or products you have installed, the Recovery utility can invoke the Concurrent Pointer Checking feature for full-function databases to validate the pointers in each recovered database.
- Depending on the product or products you have installed, the Recovery utility can invoke the Concurrent Pointer Checking feature for DEDBs to validate the pointers in each recovered DEDB.
- The Recovery utility can create duplicates of databases that can be used for inquiry-only access by offline batch applications. The duplicate databases can also be used for disaster recovery.
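The first two techniques in this list, a single pass over the log for multiple data sets and recovery to a specified point in time, can be sketched as follows. This is a conceptual illustration only and does not describe how the Recovery utility is implemented: the `LogRecord` layout, the dictionaries standing in for data sets, and the `recover_to` parameter are hypothetical. The sketch also ignores the handling of updates that are uncommitted at the chosen time, which a real point-in-time recovery must address.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class LogRecord(NamedTuple):
    """Hypothetical, simplified logged change."""
    time: int        # logical timestamp of the update
    data_set: str    # data set the change belongs to
    key: str
    value: str


def recover_single_pass(
    image_copies: Dict[str, Dict[str, str]],
    log: List[LogRecord],
    recover_to: Optional[int] = None,
) -> Tuple[Dict[str, Dict[str, str]], int]:
    """Recover all data sets while reading each log record at most once."""
    # Restore every data set from its copy before applying any changes.
    recovered = {name: dict(copy) for name, copy in image_copies.items()}
    records_read = 0
    for record in sorted(log, key=lambda r: r.time):
        if recover_to is not None and record.time > recover_to:
            break                      # stop at the requested point in time
        records_read += 1
        target = recovered.get(record.data_set)
        if target is not None:         # route the change to its data set
            target[record.key] = record.value
    return recovered, records_read


image_copies = {"DS1": {"a": "0"}, "DS2": {"b": "0"}, "DS3": {}}
log = [
    LogRecord(1, "DS1", "a", "1"),
    LogRecord(2, "DS2", "b", "2"),
    LogRecord(3, "DS3", "c", "3"),
    LogRecord(4, "DS1", "a", "4"),
]
full, full_reads = recover_single_pass(image_copies, log)
partial, partial_reads = recover_single_pass(image_copies, log, recover_to=2)
```

Here the full recovery reads 4 log records regardless of how many data sets are being recovered, and the recovery to time 2 stops after 2 records, leaving `DS1` with the value written at time 1.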