Overview of ARMBSRR-generated jobs


ARMBSRR generates the following sets of jobs to perform a conditional restart recovery of a Db2 subsystem or data sharing object set:

  • Phase 1 jobs: run while Db2 is down
  • Phase 2 jobs: run after Db2 is restarted in MAINT mode
  • Data collection jobs: run after application data recovery (Recovery Management for Db2 solution only)

The jobs generated by ARMBSRR are written to a single PDS member or sequential file. ARMBSRR requires the job card to contain the symbolic variable &## to allow it to number the jobs it creates. The job card that ARMBSRR uses in the generated JCL is specified via the ARMJCIN DD statement.
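As a rough illustration of how a numbered job name can be derived from a job card that contains &##, the following Python sketch substitutes a job number for the symbolic variable. The function name and the sample job card are hypothetical, and the two-digit zero-padded format is an assumption inferred from job names such as 01 and 02 used later in this topic. This is not ARMBSRR's actual implementation.

```python
def number_job_card(job_card: str, job_number: int) -> str:
    """Replace the symbolic variable &## with a zero-padded job number."""
    if "&##" not in job_card:
        # ARMBSRR requires &## in the job card; this sketch mirrors that rule.
        raise ValueError("job card must contain the symbolic variable &##")
    # Assumption: two-digit, zero-padded numbering (01, 02, ...).
    return job_card.replace("&##", f"{job_number:02d}")


# Hypothetical job card; only the job name BMCBSR&## comes from this topic.
card = "//BMCBSR&## JOB (ACCT),'DR RECOVERY',CLASS=A"
print(number_job_card(card, 1))
print(number_job_card(card, 12))
```

A job card without &## raises an error in this sketch, which reflects the stated requirement rather than any documented ARMBSRR message.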

Each job contains a comment to indicate whether it is a Phase 1, Phase 2, or data collection job and what its job number is within that phase. An example follows:

//* DISASTER RECOVERY FOR SYSTEM RESOURCES - PHASE 2 JOB 2  *//
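If you need to sort or audit the generated jobs, a comment like the one above can be scanned for its phase and job number. The following Python sketch is based only on the single sample comment shown here; the function name is hypothetical, and the wording of Phase 1 and data collection comments is assumed, not confirmed, to follow a similar pattern.

```python
import re

# Pattern derived from the one sample comment in this topic.
# Comments with different wording (for example, data collection jobs)
# are not covered and return None.
PHASE_JOB = re.compile(r"PHASE\s+(\d+)\s+JOB\s+(\d+)")


def parse_phase_comment(line: str):
    """Return (phase, job_number) from a JCL comment line, or None."""
    if not line.startswith("//*"):
        return None  # not a JCL comment statement
    match = PHASE_JOB.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "//* DISASTER RECOVERY FOR SYSTEM RESOURCES - PHASE 2 JOB 2  *//"
print(parse_phase_comment(sample))  # (2, 2)
```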

Important

If you are performing a recovery simulation, only a portion of the Phase 1 job set runs, and Phase 2 does not run at all. If you are performing a recovery estimation, the ARMBWDC and ARMBRDC data collection jobs are run. You release the ARMBRDC job from HOLD after the application recovery completes successfully. Both recovery simulation and estimation are features of the Recovery Management for Db2 solution. For more information, see the Recovery Management for Db2 documentation.

About Phase 1

There is at least one job per subsystem in Phase 1.

For data sharing, there is at least one job per member. The jobs are numbered 1 to n, where

  • n is the number of members in the data sharing object set
  • 1 indicates a non-data-sharing environment

If you specify a MAXLOGJOBS value greater than one, BMC AMI Recovery Manager generates additional jobs for each subsystem to provide for parallel log copies to disk. The log copy jobs are numbered sequentially beginning with (n + 1). A maximum of 32 total jobs is allowed.

Important

BMC AMI Recovery Manager performs stacked tape analysis before creating the Phase 1 JCL. The number of log copy jobs can vary based on this analysis and might be fewer than you requested with the MAXLOGJOBS option.

When the JCL is submitted, the Phase 1 jobs begin executing immediately. If logs are being copied to disk, additional jobs are submitted to the internal reader at the end of the initial Phase 1 jobs. A Phase 2 job is placed on hold while the Phase 1 jobs execute. If you also used the local subsystem recovery option to generate application recovery JCL, a second job 01 is on hold; it is used to create the application recovery JCL.

If you are using the Recovery Management for Db2 solution, a data collection job is also placed on hold. Also, data collection information is written to a flat file during Phase 1 processing.

Example - Phase 1 execution

To illustrate, assume that you have a two-member data sharing system, MAXLOGJOBS 3, and a job name of BMCBSR&##. When the JCL is submitted, you see jobs 01 and 02 begin executing immediately. You also see another job 01 on hold; this is the first Phase 2 job. As one of its final steps, the executing job 01 submits jobs 03 and 04 to copy logs for member 1. Job 01 itself also copies some of the logs, resulting in a total of three jobs that copy logs for member 1. Job 02 submits jobs 05 and 06 to copy some of the logs for member 2, and job 02 itself copies the remainder.
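The numbering in this example can be reproduced with a short Python sketch. It assumes that each member's initial job counts toward MAXLOGJOBS (so each submits MAXLOGJOBS - 1 additional jobs) and that additional jobs are numbered member by member starting at n + 1, as the example suggests. The function name is hypothetical, rejecting plans over 32 jobs is a choice made for this sketch, and stacked tape analysis, which can reduce the number of log copy jobs, is not modeled.

```python
MAX_TOTAL_JOBS = 32  # limit stated in this topic


def phase1_job_plan(members: int, maxlogjobs: int) -> dict:
    """Map each initial Phase 1 job number to the log copy jobs it submits."""
    if members < 1 or maxlogjobs < 1:
        raise ValueError("members and maxlogjobs must be at least 1")
    # The initial job for a member also copies logs, so it submits one fewer.
    extra_per_member = maxlogjobs - 1
    total = members * maxlogjobs
    if total > MAX_TOTAL_JOBS:
        # Assumption: this sketch rejects the plan; the product may behave differently.
        raise ValueError(f"{total} jobs exceeds the maximum of {MAX_TOTAL_JOBS}")
    plan = {}
    next_job = members + 1  # log copy jobs start at n + 1
    for member_job in range(1, members + 1):
        plan[member_job] = list(range(next_job, next_job + extra_per_member))
        next_job += extra_per_member
    return plan


# Two-member data sharing, MAXLOGJOBS 3
print(phase1_job_plan(2, 3))  # {1: [3, 4], 2: [5, 6]}
```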

No synchronization between Phase 1 jobs is required. The only requirement is that they must all be complete before you start Db2 in MAINT mode and before you release the Phase 2 job that is on hold.

The following figure shows the Phase 1 execution (two-member data sharing, MAXLOGJOBS=3):

[Figure: Phase 1 execution]

About Phase 2

When Phase 1 jobs are complete, follow the instructions in the JCL for clearing the Coupling Facility for data sharing and starting Db2. You may then release the Phase 2 job to begin executing.

If only a single job is needed by Phase 2, it executes immediately.

Phase 2 is performed by:

  • Multiple jobs for data sharing

    For data-sharing environments, there is at least one job per member.

  • One job with multiple tasks, when you specify a value for MAXCATJOBS greater than one

    BMC AMI Recovery Manager uses the value of MAXCATJOBS for PARALLEL and TAPEUNITS to perform multiple tasks in one job.

Example
  • A two-member data sharing object set has at least two jobs (one for each member).
  • A non-data-sharing system with MAXCATJOBS=3 has one job that performs three tasks.

Important

Some conditions, such as stacked tape, can prevent concurrent jobs for catalog recovery.

When multiple jobs are required for Phase 2, the first job that executes is the one that was placed on hold initially during Phase 1. It allocates a synchronization file that is used by the subsequent Phase 2 recovery jobs to monitor and synchronize the work between jobs. The first job then submits the actual Phase 2 recovery jobs. Once it has submitted the other jobs, it ends.

The first action of the first Phase 2 job is to submit a synchronization cleanup job, which is also named 01. The synchronization cleanup job runs after Phase 2 recovery job 01 completes. If all jobs run successfully, the cleanup job then deletes the synchronization file.

For data sharing object sets, a Phase 2 job executes for each member and is routed to the system on which its corresponding member ran at the local site. There may also be additional jobs for catalog recovery, as previously described. These jobs use the synchronization program and wait to execute at the appropriate time in the process.

Important

You must change the SYSAFF= parameter for JES3, or if the members run in a system configuration that differs from the one at the local site.

If Phase 2 completes successfully, a Db2 STOP command is issued. You then start Db2 for normal access to begin the application recovery process.

At this point, if you have a job to generate application recovery JCL (ARMBGEN) on hold, you should release it when the Db2 start has been completed successfully. Generating recovery JCL at this point is expected for Full Subsystem Local Recovery. (Disaster recovery procedures typically generate the JCL at the local site as part of the preparation process.)

If you are using the Recovery Management for Db2 solution, the Phase 2 jobs should all be completed before you release the data collection job. Also, data collection information is written to data collection tables during Phase 2 processing.
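The operator actions described in this topic can be summarized as an ordered checklist. The following Python sketch builds that checklist from three yes/no inputs. The function and parameter names are hypothetical, and the step wording paraphrases this topic. It is a reading aid under those assumptions and does not replace the instructions in the generated JCL.

```python
def recovery_steps(data_sharing: bool, armbgen_on_hold: bool,
                   recovery_management: bool) -> list:
    """Return the operator steps, in order, as described in this topic."""
    steps = ["Wait for all Phase 1 jobs to complete"]
    if data_sharing:
        steps.append("Clear the Coupling Facility as instructed in the JCL")
    steps.append("Start Db2 in MAINT mode")
    steps.append("Release the Phase 2 job that is on hold")
    steps.append("Wait for Phase 2 to complete (a Db2 STOP command is issued)")
    steps.append("Start Db2 for normal access")
    if armbgen_on_hold:
        steps.append("Release the ARMBGEN job to generate application recovery JCL")
    steps.append("Run the application recovery")
    if recovery_management:
        # Recovery Management for Db2 solution only
        steps.append("Release the data collection job")
    return steps


for number, step in enumerate(recovery_steps(True, True, True), start=1):
    print(f"{number}. {step}")
```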

The following figure shows the Phase 2 execution:

[Figure: Phase 2 execution]

About data collection jobs

For the Recovery Management for Db2 solution only, data is collected about the recoveries throughout the disaster recovery process.

During Phase 1, the data about the system resource recoveries is written to a flat file. During Phase 2, the data is written to the data collection tables. After all application data is recovered, the data collection jobs run. These jobs consolidate all data into the tables and create a file of SQL statements that you can use to populate the data collection tables at the local site. For more information, see the Recovery Management for Db2 documentation.

The following figure shows data collection:

[Figure: Data collection]



 
