Scenario 1: Why did NITEBAT finish so late?
The job NITEBAT finished well past its scheduled completion time last night.
As a result, activity in several areas of the company has been delayed. It is your job to figure out why this delay happened and, more importantly, to prevent it from happening again.
NITEBAT was supposed to finish at 1:20 A.M. this morning. Your first step is to look at the system as it existed at 1:20 A.M. and begin gathering clues.
To get the NITEBAT information
On the COMMAND line, type the TIME command for window 1 (using the format mm/dd/yyyy): TIME 11/10/YYYY 01:20:00
Until you specify otherwise, all views displayed in window 1 automatically retrieve data from the historical database for the interval between 1:15 and 1:30 A.M. (the interval containing 1:20).
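The way a single timestamp maps to a recording interval can be illustrated with a short Python sketch. It assumes that intervals are 15 minutes long and aligned to the hour, which is only inferred from the 1:15 to 1:30 A.M. interval in this scenario; the function name and the year are placeholders, not part of the product.

```python
from datetime import datetime, timedelta

# Assumption: recording intervals are 15 minutes long and aligned to the hour,
# as the 1:15-1:30 A.M. interval in this scenario suggests.
INTERVAL = timedelta(minutes=15)


def containing_interval(ts, length=INTERVAL):
    """Return (start, end) of the aligned interval that contains ts."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = ts - midnight
    # Integer division of two timedeltas gives the number of whole intervals.
    start = midnight + (elapsed // length) * length
    return start, start + length


# The year is a placeholder; the scenario leaves it unspecified (YYYY).
start, end = containing_interval(datetime(2024, 11, 10, 1, 20, 0))
print(start.time(), end.time())  # prints: 01:15:00 01:30:00
```

A time that falls exactly on a boundary, such as 1:30:00, is treated here as the start of the next interval; whether the product does the same is not stated in this scenario.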
You know for certain that NITEBAT experienced considerable delay last night, but you want to determine whether any other workloads were delayed.
On the COMMAND line, type WDELAY. The WDELAY view is displayed, as shown in Figure 1.
Figure 1. WDELAY view
DDMMMYYYY HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
>H1 =WDELAY============SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D===34
C Workload T #AS Total Delay% %Dly %Dly %Dly %Dly %Dly %Dly %Dly
- -------- - --- 0....50...100 CPU Dev Stor ENQ SRM Subs Idle
BATJOBS B 4 83.01 *********** 2.5 4.2 1.3 75.0
PGRP0030 P 21 22.76 *** 21.6 1.2 0.1 1.6
ALLSTC S 84 16.19 ** 13.3 3.0 0.1 0.4
ALLWKLDS C 90 1.11 0.7 0.4
ALLBAT B 1
TEMPCMP C 8
JCBATCH B
ALLOMVS O
PAYBAT B 1
PAYROLL C 6
PAYTSO T 5
PGRP0041 P
PGRP0000 P 30
Scanning the Total Delay% column, you discover that no workload was as critically delayed as BATJOBS, the workload containing NITEBAT. BATJOBS spent 83 percent of the interval waiting for one or more resources. Enqueue contention alone accounted for 75 percent of the interval, by far the largest share of that delay. How much of this delay was experienced by NITEBAT in particular? Were other jobs in BATJOBS affected by enqueue delay as well?
To answer these questions, you can type JDELAY on the COMMAND line, or you can rely on CMF MONITOR Online predefined hyperlinks to anticipate your information needs. You decide to rely on the predefined hyperlinks.
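As a quick check before drilling down, the BATJOBS row in Figure 1 can be read arithmetically. The Python sketch below is illustrative only. It assumes that the four values in that row belong to the CPU, Dev, Stor, and ENQ columns in that order (the text confirms only the ENQ value) and that the per-resource columns add up to Total Delay%.

```python
# Values copied from the BATJOBS row of Figure 1 (percent of the interval).
# Assumption: the column assignment for CPU, Dev, and Stor is inferred from
# the order of the values; only the ENQ figure is confirmed by the text.
batjobs_total_delay = 83.01
batjobs_components = {"CPU": 2.5, "Dev": 4.2, "Stor": 1.3, "ENQ": 75.0}


def delay_shares(components, total):
    """Return each component as a fraction of the total delay."""
    return {name: value / total for name, value in components.items()}


component_sum = sum(batjobs_components.values())
shares = delay_shares(batjobs_components, batjobs_total_delay)

print(f"sum of components: {component_sum:.1f}")               # 83.0
print(f"ENQ share of total delay: {shares['ENQ']:.1%}")        # 90.4%
```

Under these assumptions, the components account for the whole 83 percent, and enqueue contention makes up roughly nine tenths of the delay, which is why the ENQ column is the natural place to drill down.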
Position your cursor in the %Dly ENQ field for BATJOBS and press Enter. CMF MONITOR Online hyperlinks to the JDENQ view, as shown in Figure 2, where you can identify the enqueue resource causing the delay and find out why NITEBAT spent so much time contending for it.
Figure 2. JDENQ view
DDMMMYYYY HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
>H1 =JDENQ=============SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D====4
C Waiting JES Job T SrvClass %Delay %Delay Wait Major Minor RName EN
- Job----- Number - -------- This Enq All Enq Want QName--- -------------- St
MV50CAST STC07172 S SLOW 1.22 Excl SYSZTIOT = / En
MTADOM01 STC07219 S STCNRM 3.26 Excl SYSDSN LGS1.CNTL En
LGS11Q1 STC07177 S SLOW 3.26 Excl SYSDSN LGS1.CNTL En
NITEBAT STC07093 S STCNRM 100.0 Excl SYSDSN SYS.MCS.MCS En
The Waiting Job column tells you that NITEBAT is waiting for the logical enqueue resource that is identified by the major name SYSDSN, indicating that the resource is a data set. The minor name, SYS.MCS.MCS, is the name of the data set itself. If you scroll to the right by using PF11, the Owning Job column shows that a job called DDBBKUP currently owns the resource.
To find out more about this job, position your cursor under Minor RName and press Enter to display the JUENQ view, as shown in Figure 3.
Figure 3. JUENQ view
DDMMMYYYY HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
>H1 =JDENQ====JUENQ====SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D====1
C Owning JES Job T SrvClass %Use Ownr Major Minor RName ENQ Waiting
- Job----- Number - -------- ENQ Has- QName--- -------------- Status Job----
DDBBKUP STC07093 S STCNRM 97.2 Excl SYSDSN SYS.MCS.MCS Ended NITEBAT
There is the problem. DDBBKUP has been assigned exclusive (Excl) use of this enqueue resource and held it for 97 percent of the 1:15 to 1:30 A.M. interval. All other jobs, including NITEBAT, are unable to use this resource until DDBBKUP completes execution.
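The reasoning behind the two hyperlinks (find what a job is waiting for, then find who owns it) amounts to matching waiters and owners on the major and minor resource names. The Python sketch below illustrates that idea with rows transcribed from Figures 2 and 3. The record types and the blockers function are hypothetical names chosen for this example; they do not represent how CMF MONITOR Online is implemented.

```python
from collections import namedtuple

# Hypothetical record types for this illustration only.
Waiter = namedtuple("Waiter", "job major minor")
Owner = namedtuple("Owner", "job major minor pct_use")

# Rows transcribed from Figure 2 (waiting jobs) and Figure 3 (owning jobs).
waiters = [
    Waiter("MTADOM01", "SYSDSN", "LGS1.CNTL"),
    Waiter("LGS11Q1", "SYSDSN", "LGS1.CNTL"),
    Waiter("NITEBAT", "SYSDSN", "SYS.MCS.MCS"),
]
owners = [
    Owner("DDBBKUP", "SYSDSN", "SYS.MCS.MCS", 97.2),
]


def blockers(waiters, owners):
    """Map each waiting job to the (owner, %use) pairs for its resource."""
    by_resource = {}
    for o in owners:
        by_resource.setdefault((o.major, o.minor), []).append(o)
    return {
        w.job: [(o.job, o.pct_use)
                for o in by_resource.get((w.major, w.minor), [])]
        for w in waiters
    }


result = blockers(waiters, owners)
print(result["NITEBAT"])  # [('DDBBKUP', 97.2)]
```

Jobs waiting on LGS1.CNTL map to an empty list here only because Figure 3 shows owners for SYS.MCS.MCS alone; it does not mean that resource had no owner.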
Now that you know what caused last night’s delay, you are in a position to ensure that it does not happen again. One solution is to reschedule DDBBKUP so that it runs after NITEBAT has completed (although your site might prefer an alternative method).