IAM/PLEX restart scenarios
IAM/PLEX can be an integral part of your business-critical online and batch application processing. Therefore, it is imperative that if outages occur, you can restart IAM/PLEX as quickly as possible, with the least possible effort, and in an automated manner.
This topic describes how to restart IAM/PLEX after both planned and unplanned outages.
- Planned outages: IAM/PLEX must be stopped and restarted as quickly as possible, either on its current LPAR (for example, to implement parameter changes or product maintenance) or on another LPAR (for example, to perform system maintenance).
- Unplanned outages: The IAM/PLEX address space terminates abnormally or must be canceled to relieve a resource shortage. An LPAR failure might also require IAM/PLEX to be moved to an alternate LPAR within the sysplex.
Planned outage recovery
Sometimes you need to stop and restart IAM/PLEX on its current LPAR (for example, to implement parameter changes or product maintenance). Other times, you need to stop IAM/PLEX and restart it on a separate LPAR (for example, to perform system maintenance). In both cases, you need to maintain maximum availability for business-critical applications.
Planned outage recovery for online jobs
You can use the QUIESCE,RESTART operator command to quiesce all IAM file access, stop IAM/PLEX, and restart it on the same LPAR or a different LPAR within the sysplex. To restart it on a different LPAR, you must add a target LPAR command option.
The QUIESCE,RESTART command works only for started task control (STC)–type address spaces, not for job-type address spaces.
The QUIESCE,RESTART command starts the new IAM/PLEX address space and internally issues a z/OS STOP (P) command to its own address space. This causes online CICS transactional activity to suspend processing of IAM files. CICS disconnects from the IAM files and reconnects automatically when the new IAM/PLEX address space has started. For more information about CICS automatic disconnect and reconnect processing, see IAM-RLS-CICS-Considerations.
Alternatively, you can use hot standby recovery. For more information, see Hot standby recovery.
Planned outage recovery for batch jobs
Your ability to recover and restart a DBMS or file system after an interruption depends on the batch application program's ability to take regular checkpoints and implement proper restart functionality. Many batch application programs do not have any built-in checkpoint or restart capability, but the BMC AMI Application Restart Control for VSAM (ARC) product provides this functionality without batch application program changes. IAM/PLEX is integrated with ARC and suspends ARC-enabled batch jobs when the QUIESCE,RESTART command is issued. When the IAM/PLEX address space is restarted, it resumes the previously suspended ARC-enabled batch jobs.
If you do not use ARC with your batch jobs, you must allow each batch job to complete or reach its own natural termination point. Alternatively, you can cancel the batch job, which forces recovery of its IAM files after the IAM/PLEX address space is restarted.
IAM/PLEX supports a full IAM/RLS sysplex environment. If you cancel your batch jobs (IAM/PLEX does not do this automatically), IAM/RLS recovers any held locks and performs any necessary backouts.
For more information about automatic IAM file recovery, see IAM-RLS-Journaling and IAM-RLS-record-lock-recovery.
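The batch-job rules above can be summarized as a small planning aid. The following Python sketch is illustrative only: the `BatchJob` type, the `plan_quiesce` function, and the job names are hypothetical and are not part of IAM/PLEX or ARC. It only sorts jobs into those that IAM/PLEX suspends and resumes by itself (ARC-enabled) and those that an operator must let finish or cancel before issuing QUIESCE,RESTART.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class BatchJob:
    name: str          # hypothetical job name
    arc_enabled: bool  # True if the job runs under ARC


def plan_quiesce(jobs: Iterable[BatchJob]) -> Dict[str, List[str]]:
    """Group jobs by how they are handled during QUIESCE,RESTART."""
    plan: Dict[str, List[str]] = {
        "suspended_and_resumed": [],  # ARC-enabled: handled by IAM/PLEX
        "complete_or_cancel": [],     # no ARC: operator decision required
    }
    for job in jobs:
        key = "suspended_and_resumed" if job.arc_enabled else "complete_or_cancel"
        plan[key].append(job.name)
    return plan


plan = plan_quiesce([BatchJob("PAYROLL1", True), BatchJob("RPTDAILY", False)])
print(plan)
```

A list like `complete_or_cancel` could feed whatever scheduling or automation checks a site runs before a planned restart.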
Unplanned outage recovery
Setting PLEXRESTART=YES registers IAM/PLEX with the z/OS Automatic Restart Manager (ARM). ARM supports two types of unplanned outages: LPAR failure and address space failure.
If an LPAR has failed, ARM attempts to restart the IAM/PLEX address space on one of the TARGET_SYSTEM LPARs. You must prepare each target LPAR to run an IAM/PLEX address space by starting IAMMAIN on it. For more information, see Security-license-management-with-IAMMAIN. If you specify the NOVIF execution parameter on IAMMAIN, make sure that VIFSTART has run on each target LPAR. For more information about the VIFSTART job, see Activating-the-IAM-VSAM-interface.
If an IAM/PLEX address space has failed, ARM attempts to restart it on the same LPAR on which it was running, provided that the LPAR is still a member of the sysplex.
The default ARM policy does not allow the IAM/PLEX address space to restart, so your z/OS system administrator should create a new ARM policy.
Creating an ARM policy
IAM/PLEX registers with ARM with a member name of $IAMxxxxxxxxyyyy, where xxxxxxxx is the RLSGROUP name of the IAM/PLEX and yyyy is the RLSID of the IAM/PLEX address space.
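To check which element name to expect in an ARM policy, the naming rule above can be sketched in Python. This is a minimal illustration, not product code: `arm_element_name` and `matches_policy_element` are hypothetical helpers, the sample RLSGROUP and RLSID values are invented, and the sketch simply concatenates the values because the padding of shorter names is not described here. Python's `fnmatch` is used as an approximation of ARM wildcard matching.

```python
from fnmatch import fnmatchcase


def arm_element_name(rlsgroup: str, rlsid: str) -> str:
    """Build $IAM + RLSGROUP + RLSID (assumption: no padding is applied)."""
    if not 1 <= len(rlsgroup) <= 8:
        raise ValueError("RLSGROUP is expected to be 1 to 8 characters")
    if not 1 <= len(rlsid) <= 4:
        raise ValueError("RLSID is expected to be 1 to 4 characters")
    return f"$IAM{rlsgroup}{rlsid}"


def matches_policy_element(element_name: str, pattern: str) -> bool:
    """Approximate an ARM ELEMENT wildcard check with shell-style matching."""
    return fnmatchcase(element_name, pattern)


name = arm_element_name("PRODGRP1", "A001")  # hypothetical values
print(name, matches_policy_element(name, "$IAM*"))
```

With a full 8-character RLSGROUP and 4-character RLSID, the resulting name is 16 characters long, and a policy entry such as ELEMENT($IAM*) covers it.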
A sample IAM/PLEX ARM policy follows:
RESTART_GROUP(IAMGROUP)
TARGET_SYSTEM(SYSB,SYSC)
ELEMENT($IAM*)   <-- IAM/PLEX element name begins with $IAM
RESTART_METHOD(BOTH,PERSIST)
RESTART_ATTEMPTS(3,300)
For more information about IBM automatic restart management parameters, see the IBM documentation.
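As a rough illustration of what RESTART_ATTEMPTS(3,300) in the sample policy expresses (at most three restart attempts within a 300-second interval), the following Python sketch models the limit as a sliding window. The `RestartLimiter` class is hypothetical, and the way ARM itself counts attempts may differ in detail; consult the IBM documentation for the authoritative behavior.

```python
from collections import deque


class RestartLimiter:
    """Sliding-window model of a (max_attempts, interval_seconds) limit."""

    def __init__(self, max_attempts: int = 3, interval_seconds: float = 300):
        self.max_attempts = max_attempts
        self.interval_seconds = interval_seconds
        self._attempts = deque()  # timestamps of restarts that were allowed

    def allow_restart(self, now: float) -> bool:
        # Forget attempts that fall outside the interval ending at `now`.
        while self._attempts and now - self._attempts[0] >= self.interval_seconds:
            self._attempts.popleft()
        if len(self._attempts) >= self.max_attempts:
            return False  # limit reached: no further restart in this window
        self._attempts.append(now)
        return True


limiter = RestartLimiter(3, 300)
decisions = [limiter.allow_restart(t) for t in (0, 60, 120, 180, 400)]
print(decisions)
```

In this model the fourth failure inside the window is not restarted, while a failure that occurs after earlier attempts have aged out of the interval is restarted again.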
IAM/PLEX ARM event exit
IAM/PLEX uses an ARM event exit, IAMBEVNT. This exit is loaded into the link pack area (LPA) by the VIFSTART process. You must run VIFSTART on any LPAR on which you want IAM/PLEX to restart. You can either run the VIFSTART job or remove the NOVIF execution parameter from the IAMMAIN job that must be running on the target LPAR.
The IAM/PLEX ARM event exit provides an additional layer of user control over where an IAM/PLEX address space should restart. By using the TOSYSID parameter, you can specify a target system on which to restart the IAM/PLEX address space, overriding the TARGET_SYSTEM specification in the ARM policy.
The IAM/PLEX ARM event exit always allows the IAM/PLEX address space to restart on the same LPAR on which it was running when the address space failed. The TOSYSID parameter alters the target system if an LPAR fails.
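The restart-location rules described in this topic can be condensed into a short decision sketch. The Python function below is hypothetical and is not the IAMBEVNT exit; it only restates the documented behavior. Picking the first listed TARGET_SYSTEM entry is a simplifying assumption, because ARM applies its own selection criteria among the eligible systems.

```python
from typing import Optional, Sequence


def choose_restart_system(
    failure: str,
    failed_system: str,
    target_systems: Sequence[str],
    tosysid: Optional[str] = None,
) -> Optional[str]:
    """Return the LPAR on which a restart would be attempted, or None."""
    if failure == "address_space":
        # Address space failure: restart in place; TOSYSID is not consulted.
        return failed_system
    if failure == "lpar":
        if tosysid:
            # TOSYSID overrides the TARGET_SYSTEM list in the ARM policy.
            return tosysid
        # Simplification: take the first listed target other than the failed LPAR.
        for system in target_systems:
            if system != failed_system:
                return system
        return None
    raise ValueError(f"unknown failure type: {failure!r}")


print(choose_restart_system("lpar", "SYSA", ["SYSB", "SYSC"], tosysid="SYSC"))
```

The system names SYSA, SYSB, and SYSC are placeholders that follow the sample policy; the failure labels "address_space" and "lpar" are names chosen for this sketch only.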
Hot standby recovery
IAM/PLEX provides an option to start a Hot Standby IAM/PLEX address space. The Hot Standby monitors the health of one IAM/PLEX address space and is started on a different LPAR from the IAM/PLEX address space that it monitors.
When the Hot Standby detects that the monitored IAM/PLEX address space is no longer running, whether because of a planned or unplanned LPAR outage or an IAM/PLEX address space outage, it immediately takes over as the new IAM/PLEX address space.
The IAM/PLEX address space and the Hot Standby IAM/PLEX address space use the same RLSID.