Overview of ARMBSRR-generated jobs
ARMBSRR generates the following sets of jobs to perform a conditional restart recovery of a Db2 subsystem or data sharing object set:
- Phase 1 jobs: run while Db2 is down
- Phase 2 jobs: run after Db2 is restarted in MAINT mode
- Data collection jobs: run after application data recovery (Recovery Management for Db2 solution only)
The jobs generated by ARMBSRR are written to a single PDS member or sequential file. ARMBSRR requires the job card to contain the symbolic variable &## to allow it to number the jobs it creates. The job card that ARMBSRR uses in the generated JCL is specified via the ARMJCIN DD statement.
Each job contains a comment to indicate whether it is a Phase 1, Phase 2, or data collection job and what its job number is within that phase. An example follows:
About Phase 1
There is at least one job per subsystem in Phase 1.
For data sharing, there is at least one job per member. The jobs are numbered 1 to n, where n is the number of members in the data sharing object set. In a non-data-sharing environment, n is 1.
If you specify MAXLOGJOBS greater than one, BMC AMI Recovery Manager generates additional jobs for each subsystem to provide for parallel log copies to disk. The log copy jobs are numbered sequentially beginning with (n + 1). A maximum of 32 total jobs is allowed.
When the JCL is submitted, the Phase 1 jobs begin executing immediately. If you are copying logs to disk, additional jobs are submitted to the internal reader at the end of the initial Phase 1 jobs. A Phase 2 job is placed on hold while the Phase 1 jobs execute. If you also used the local subsystem recovery option to generate application recovery JCL, a second job 01 is placed on hold; it is used to create the application recovery JCL.
If you are using the Recovery Management for Db2 solution, a data collection job is also placed on hold. Also, data collection information is written to a flat file during Phase 1 processing.
Example - Phase 1 execution
To illustrate, assume that you have a two-member data sharing system, MAXLOGJOBS 3, and a job name of BMCBSR&##. When the JCL is submitted, jobs 01 and 02 begin executing immediately. You also see a job 01 on hold; this is the first Phase 2 job. As one of its final steps, job 01 submits jobs 03 and 04 to copy logs for member 1. Job 01 itself also copies some of the logs, resulting in a total of three jobs that copy logs for member 1. Likewise, job 02 submits jobs 05 and 06 to copy some of the logs for member 2, and job 02 itself copies the remainder.
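The numbering in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not ARMBSRR's own logic: the function name `plan_phase1_jobs` is hypothetical, and the sketch assumes that each member gets MAXLOGJOBS minus one additional log copy jobs, assigned in member order, and that the limit of 32 applies to the total number of Phase 1 jobs.

```python
def plan_phase1_jobs(job_template, members, maxlogjobs, max_total=32):
    """Map each initial Phase 1 job name to the log copy jobs it submits.

    Hypothetical helper for illustration only; it mirrors the numbering
    described in the text and is not part of ARMBSRR.
    """
    if "&##" not in job_template:
        raise ValueError("job name must contain the symbolic variable &##")
    if members < 1 or maxlogjobs < 1:
        raise ValueError("members and maxlogjobs must be at least 1")

    # Assumption: every member uses MAXLOGJOBS jobs in total
    # (the initial job plus the additional log copy jobs).
    total = members * maxlogjobs
    if total > max_total:
        raise ValueError(f"{total} jobs exceeds the maximum of {max_total}")

    def name(number):
        # Replace &## with a two-digit job number, as in BMCBSR01.
        return job_template.replace("&##", f"{number:02d}")

    extra_per_member = maxlogjobs - 1
    next_number = members + 1  # log copy jobs start at n + 1
    plan = {}
    for member in range(1, members + 1):
        submitted = [name(next_number + i) for i in range(extra_per_member)]
        next_number += extra_per_member
        plan[name(member)] = submitted
    return plan


if __name__ == "__main__":
    for initial, submitted in plan_phase1_jobs("BMCBSR&##", 2, 3).items():
        print(initial, "submits", submitted)
```

For the two-member, MAXLOGJOBS 3 case, the sketch pairs BMCBSR01 with BMCBSR03 and BMCBSR04, and BMCBSR02 with BMCBSR05 and BMCBSR06, matching the example above.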
No synchronization between Phase 1 jobs is required. The only requirement is that they must all complete before you start Db2 in MAINT mode and before you release the Phase 2 job that is on hold.
The following figure shows the Phase 1 execution (2 member data sharing, MAXLOGJOBS=3):
About Phase 2
When Phase 1 jobs are complete, follow the instructions in the JCL for clearing the Coupling Facility for data sharing and starting Db2. You may then release the Phase 2 job to begin executing.
If only a single job is needed by Phase 2, it executes immediately.
Phase 2 is performed by:
- Multiple jobs for data sharing. For data sharing environments, there is at least one job per member.
- One job with multiple tasks, for Db2 Version 10 and later, when you specify a value for MAXCATJOBS greater than one. BMC AMI Recovery Manager uses the value of MAXCATJOBS for PARALLEL and TAPEUNITS to perform multiple tasks in one job.
When multiple jobs are required for Phase 2, the first job that executes is the one that was placed on hold initially during Phase 1. It allocates a synchronization file that is used by the subsequent Phase 2 recovery jobs to monitor and synchronize the work between jobs. The first job then submits the actual Phase 2 recovery jobs. Once it has submitted the other jobs, it ends.
The first action of the first Phase 2 job is to submit a synchronization cleanup job, which is also numbered 01. The synchronization cleanup job runs after Phase 2 recovery job 01 completes. If all jobs run successfully, the cleanup job then deletes the synchronization file.
For data sharing object sets, a Phase 2 job executes for each member and is routed to the system on which its corresponding member ran at the local site. There may also be additional jobs for catalog recovery, as previously described. These jobs use the synchronization program and wait to execute at the appropriate time in the process.
If Phase 2 completes successfully, a Db2 STOP command is issued. You then start Db2 for normal access to begin the application recovery process.
At this point, if you have a job to generate application recovery JCL (ARMBGEN) on hold, you should release it when the Db2 start has completed successfully. Generating recovery JCL at this point is expected for Full Subsystem Local Recovery. (Disaster recovery procedures typically generate the JCL at the local site as part of the preparation process.)
If you are using the Recovery Management for Db2 solution, the Phase 2 jobs should all complete before you release the data collection job. Also, data collection information is written to data collection tables during Phase 2 processing.
The following figure shows the Phase 2 execution:
About data collection jobs
For the Recovery Management for Db2 solution only, data is collected about the recoveries throughout the disaster recovery process.
During Phase 1, the data about the system resource recoveries is written to a flat file. During Phase 2, the data is written to the data collection tables. After all application data is recovered, the data collection jobs run. These jobs consolidate all data into the tables and create a file of SQL statements that you can use to populate the data collection tables at the local site. For more information, see the Recovery Management for Db2 documentation.
The following figure shows data collection: