IAM/PLEX restart scenarios


IAM/PLEX can be an integral part of your business-critical online and batch application processing. Therefore, it is imperative that if outages occur, you can restart IAM/PLEX as quickly as possible, with the least possible effort, and in an automated manner.

This topic describes how to restart IAM/PLEX after both planned and unplanned outages.

  • Planned outages: You need to stop IAM/PLEX and restart it as quickly as possible, either on its current LPAR (for example, to implement parameter changes or product maintenance) or on another LPAR (for example, to perform system maintenance).
  • Unplanned outages: The IAM/PLEX address space abnormally terminates or must be cancelled to relieve a resource shortage. An LPAR failure can also require IAM/PLEX to be moved to an alternate LPAR within the sysplex.

Important

  • IAM/PLEX supports 16 address spaces in an RLSGROUP, which includes both FILESERVER/TARGET and ROUTER address spaces.
  • Enable the KEEPALIVE parameter for ARC enablement and the PLEXRESTART parameter for ARM through IAMPLEX parameters or Global Options.

Planned outage recovery

Sometimes you need to stop and restart IAM/PLEX on its current LPAR (for example, to implement parameter changes or product maintenance). Other times, you need to stop IAM/PLEX and restart it on a separate LPAR (for example, to perform system maintenance). In both cases, you need to maintain maximum availability for business-critical applications.

Planned outage recovery for online jobs

You can use the QUIESCE,RESTART operator command to quiesce all IAM file access, stop IAM/PLEX, and restart it on the same LPAR or a different LPAR within the sysplex. To restart it on a different LPAR, you must add a target LPAR command option.

The QUIESCE,RESTART command works only for started task control (STC)–type address spaces, not for job-type address spaces.

The QUIESCE,RESTART command starts the new IAM/PLEX address space and internally issues a z/OS STOP (P) command to its own address space. This causes online CICS transactional activity to suspend processing of IAM files. CICS disconnects from the IAM files and reconnects automatically when the new IAM/PLEX address space is started. For more information about CICS automatic disconnect and reconnect processing, see IAM-RLS-CICS-Considerations.

Alternatively, you can use hot standby recovery. For more information, see Hot standby recovery.

Important

All online IAM data sets must be closed, either before or after you issue the QUIESCE,RESTART command. You can close these data sets manually from CICS or by issuing the QUIESCE,FORCE command.

If required, run the QUIESCE,FORCE command on the ROUTER address spaces first and then on the TARGET/FILESERVER address space.

Planned outage recovery for batch jobs

Your ability to recover and restart a DBMS or file system after an interruption depends on the batch application program's ability to take regular checkpoints and implement proper restart functionality. Many batch application programs do not have any built-in checkpoint or restart capability but the BMC AMI Application Restart Control for VSAM (ARC) product provides this functionality without batch application program changes. IAM/PLEX is integrated with ARC and suspends ARC-enabled batch jobs when the QUIESCE,RESTART command is issued. When the IAM/PLEX address space is restarted, it resumes the previously suspended ARC-enabled batch jobs.

Important

Do not initiate new batch jobs while the QUIESCE,RESTART command is in progress.

If required, run the QUIESCE,FORCE command on the ROUTER address spaces first and then on the TARGET/FILESERVER address space.

If you do not use ARC with your batch jobs, you must allow each batch job to complete or reach its own natural termination point. You can also cancel the batch job to force recovery of the IAM file after the IAM/PLEX address space and the batch job are restarted.

IAM/PLEX supports a full IAM/RLS sysplex environment. If you cancel your batch jobs (IAM/PLEX does not do this automatically), IAM/RLS recovers any held locks and performs any necessary backouts.

For more information about automatic IAM file recovery, see IAM-RLS-Journaling and IAM-RLS-record-lock-recovery.

Important

  • If you use ARC for restart logic in an application and the QUIESCE,RESTART command is issued, the PAUSE command is automatically issued to all ARC-enabled jobs. When the new IAM/PLEX comes up, the PROCEED command is issued for all ARC-enabled jobs.
  • For non-ARC jobs, if the QUIESCE,RESTART command is issued while they are running, the old IAM/PLEX terminates and the new IAM/PLEX comes up only after all jobs are completed. The QUIESCE,FORCE command is also issued, but the jobs fail.
  • IAM RRDS data sets do not support the ARC PAUSE and PROCEED functionality for restarting batch jobs.
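
The batch-job rules in this section can be summarized as a small decision table. The following Python sketch is only an illustration of those rules as stated here; the `BatchJob` class, its fields, and the returned strings are hypothetical names introduced for this example and are not part of IAM/PLEX or ARC.

```python
from dataclasses import dataclass


@dataclass
class BatchJob:
    """Hypothetical description of a batch job that is using IAM files."""
    name: str
    arc_enabled: bool
    uses_rrds: bool = False


def quiesce_restart_effect(job: BatchJob) -> str:
    """Return the effect of QUIESCE,RESTART on a running job, per this topic."""
    if job.arc_enabled and job.uses_rrds:
        # IAM RRDS data sets do not support ARC PAUSE and PROCEED.
        return "ARC PAUSE/PROCEED not supported for RRDS"
    if job.arc_enabled:
        # ARC-enabled jobs are suspended and later resumed automatically.
        return "PAUSE issued; PROCEED issued when the new IAM/PLEX is up"
    # Non-ARC jobs must complete (or be cancelled) before the new
    # IAM/PLEX address space comes up.
    return "new IAM/PLEX comes up only after the job completes"


if __name__ == "__main__":
    for job in (
        BatchJob("PAYROLL1", arc_enabled=True),
        BatchJob("REPORT02", arc_enabled=False),
        BatchJob("RRDSJOB3", arc_enabled=True, uses_rrds=True),
    ):
        print(f"{job.name}: {quiesce_restart_effect(job)}")
```

A table like this can help when you decide which jobs to let run to completion before you issue the QUIESCE,RESTART command.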

Unplanned outage recovery

Setting PLEXRESTART=YES registers IAM/PLEX with the z/OS Automatic Restart Manager (ARM). ARM handles two types of unplanned outages: LPAR failure and address space failure.

If an LPAR has failed, ARM attempts to restart the IAM/PLEX address space on one of the TARGET_SYSTEM LPARs. You must prepare each target LPAR to run an IAM/PLEX address space by starting IAMMAIN on it. For more information, see Security-license-management-with-IAMMAIN. If you specify the NOVIF execution parameter on IAMMAIN, make sure that VIFSTART has run on each target LPAR. For more information about the VIFSTART job, see Activating-the-IAM-VSAM-interface.

If an IAM/PLEX address space has failed, ARM attempts to restart it on the same LPAR on which it was running, but only if that LPAR is a member of the sysplex.

The default ARM policy does not allow the IAM/PLEX address space to restart, so your z/OS system administrator should create a new ARM policy.

Creating an ARM policy

IAM/PLEX registers with ARM with a member name of $IAMxxxxxxxxyyyy, where xxxxxxxx is the RLSGROUP name of the IAM/PLEX and yyyy is the RLSID of the IAM/PLEX address space. 

A sample IAM/PLEX ARM policy follows:

  DEFINE POLICY NAME(CIAMPOL) REPLACE(YES)
    RESTART_GROUP(IAMGROUP)
      TARGET_SYSTEM(SYSB,SYSC)
      ELEMENT($IAM*)            /* IAM/PLEX element name begins with $IAM */
        RESTART_METHOD(BOTH,PERSIST)
        RESTART_ATTEMPTS(3,300)

For more information about IBM automatic restart management parameters, see the IBM documentation.
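
To check that a given IAM/PLEX address space would be covered by the ELEMENT($IAM*) specification in the sample policy, it can help to build the expected element name by hand. The following Python sketch does this under stated assumptions: the function names and the RLSGROUP and RLSID values are hypothetical, the name is formed by simple concatenation (how names shorter than 8 or 4 characters are padded is not described in this topic), and Python's `fnmatch` is used only as a stand-in for ARM's wildcard matching.

```python
from fnmatch import fnmatchcase

ARM_ELEMENT_PREFIX = "$IAM"


def arm_element_name(rlsgroup: str, rlsid: str) -> str:
    """Build $IAMxxxxxxxxyyyy from an RLSGROUP name and an RLSID.

    Assumption: plain concatenation with no padding.
    """
    if not 1 <= len(rlsgroup) <= 8:
        raise ValueError("RLSGROUP name must be 1 to 8 characters")
    if not 1 <= len(rlsid) <= 4:
        raise ValueError("RLSID must be 1 to 4 characters")
    return f"{ARM_ELEMENT_PREFIX}{rlsgroup}{rlsid}"


def matches_policy_element(element_name: str, pattern: str = "$IAM*") -> bool:
    """Approximate the ELEMENT(...) wildcard check from the sample policy."""
    return fnmatchcase(element_name, pattern)


if __name__ == "__main__":
    # PRODGRP1 and RLS1 are made-up values for illustration.
    name = arm_element_name("PRODGRP1", "RLS1")
    print(name, matches_policy_element(name))
```

If your site uses a narrower ELEMENT pattern than $IAM*, pass that pattern as the second argument to see which address spaces it would cover.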

IAM/PLEX ARM event exit

IAM/PLEX uses an ARM event exit, IAMBEVNT. This exit is loaded into the link pack area (LPA) by the VIFSTART process. You must run VIFSTART on any LPAR on which you want IAM/PLEX to restart. To do so, either run the VIFSTART job or remove the NOVIF execution parameter from the IAMMAIN job that must be running on the target LPAR.

The IAM/PLEX ARM event exit provides an additional layer of user control over where an IAM/PLEX address space should restart. By using the TOSYSID parameter, you can specify a target system on which to restart the IAM/PLEX address space, overriding the ARM policy TARGET_SYSTEM specification.

The IAM/PLEX ARM event exit always allows the IAM/PLEX address space to restart on the same LPAR on which it was running when the address space failed. The TOSYSID parameter alters the target system if an LPAR fails. 

Important

  • ARM restarts started task control (STC)-type address spaces. It cannot resubmit job-type address spaces.
  • If the address space fails, ARM attempts to restart it on the same LPAR on which it was running when it failed, even if the LPAR is not specified as a TARGET_SYSTEM in the ARM policy.
  • If an LPAR fails, ARM attempts to restart the address space on one of the alternate LPARs listed in the ARM policy TARGET_SYSTEM specification. This means that the IAM/PLEX ARM event exit runs on that alternate LPAR.
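
The restart-location rules described in this section can be modeled in a few lines. The following Python sketch is an illustration of the documented behavior only, not the logic of ARM or of the IAMBEVNT exit; the function name, the failure-type strings, and the system names are hypothetical.

```python
from typing import List, Optional, Sequence


def restart_candidates(
    failure: str,
    failed_lpar: str,
    target_systems: Sequence[str],
    tosysid: Optional[str] = None,
) -> List[str]:
    """Return the LPARs on which a restart would be attempted.

    failure is "address_space" or "lpar" (labels invented for this sketch).
    """
    if failure == "address_space":
        # Always the same LPAR, even if it is not listed in TARGET_SYSTEM.
        return [failed_lpar]
    if failure == "lpar":
        if tosysid:
            # TOSYSID overrides the ARM policy TARGET_SYSTEM specification.
            return [tosysid]
        # Otherwise, the alternate LPARs from the ARM policy.
        return [s for s in target_systems if s != failed_lpar]
    raise ValueError(f"unknown failure type: {failure!r}")


if __name__ == "__main__":
    policy_targets = ["SYSB", "SYSC"]  # from the sample policy
    print(restart_candidates("address_space", "SYSA", policy_targets))
    print(restart_candidates("lpar", "SYSA", policy_targets))
    print(restart_candidates("lpar", "SYSA", policy_targets, tosysid="SYSC"))
```

Whichever LPAR is selected must already have IAMMAIN running and VIFSTART completed, as described earlier in this topic.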

Hot standby recovery

IAM/PLEX provides an option to start a Hot Standby IAM/PLEX address space. The Hot Standby monitors the health of one IAM/PLEX address space and is started on a different LPAR from the IAM/PLEX address space that it monitors.

When the Hot Standby detects that the IAM/PLEX address space is no longer running, whether because of a planned or unplanned LPAR outage or an IAM/PLEX address space outage, it immediately takes over as the new IAM/PLEX address space.

The IAM/PLEX address space and the Hot Standby IAM/PLEX address space use the same RLSID.

 
