Performance scenarios


Each scenario in this section opens with a hypothetical performance problem, and then moves through a succession of views until the source of the problem is pinpointed.

Note that these scenarios illustrate only the most common path through CMF MONITOR Online. Depending on your level of expertise, you might choose a different, more sophisticated problem-solving methodology.

Scenario 1: Why did NITEBAT finish so late?

The job NITEBAT finished well past its scheduled completion time last night.

As a result, activity in several areas of the company has been delayed. It is your job to figure out why this delay happened and, more importantly, to prevent it from happening again.

NITEBAT was supposed to finish at 1:20 A.M. this morning. Your first step is to look at the system as it existed at 1:20 A.M. and begin gathering clues.

To get the NITEBAT information

  1. On the COMMAND line, type the TIME command for window 1 (using the format mm/dd/yyyy): TIME 11/10/YYYY 01:20:00

    Until you specify otherwise, all views displayed in window 1 automatically retrieve data from the historical database for the interval between 1:15 and 1:30 A.M. (the interval containing 1:20).

    You know for certain that NITEBAT experienced considerable delay last night, but you want to determine whether any other workloads were delayed.

  2. On the COMMAND line, type WDELAY. The WDELAY view is displayed, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
    >H1 =WDELAY============SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D===34
     C Workload T #AS        Total Delay%   %Dly  %Dly  %Dly  %Dly  %Dly  %Dly  %Dly
     - -------- - ---        0....50...100   CPU   Dev  Stor   ENQ   SRM  Subs  Idle
       BATJOBS  B   4  83.01 ***********     2.5   4.2   1.3  75.0                  
       PGRP0030 P  21  22.76 ***            21.6   1.2   0.1   1.6                  
       ALLSTC   S  84  16.19 **             13.3   3.0   0.1   0.4                  
       ALLWKLDS C  90   1.11                       0.7         0.4                  
       ALLBAT   B   1                                                               
       TEMPCMP  C   8                                                               
       JCBATCH  B                                                                   
       ALLOMVS  O                                                                   
       PAYBAT   B   1                                                               
       PAYROLL  C   6                                                               
       PAYTSO   T   5                                                               
       PGRP0041 P                                                                   
       PGRP0000 P  30

    Scanning the Total Delay% column, you discover that no workload was as critically delayed as BATJOBS, the workload containing NITEBAT. BATJOBS spent 83 percent of the interval waiting for one or more resources. Enqueue contention (the %Dly ENQ column) accounts for 75 of those 83 percentage points. How much of this delay was experienced by NITEBAT in particular? Were other jobs in BATJOBS affected by enqueue delay as well?

    To answer these questions, you can type JDELAY on the COMMAND line, or you can rely on CMF MONITOR Online predefined hyperlinks to anticipate your information needs. You decide to rely on the predefined hyperlinks.

  3. Position your cursor in the %Dly ENQ field for BATJOBS and press Enter. CMF MONITOR Online hyperlinks to the JDENQ view, as shown in the following figure, where you can identify the enqueue resource causing the delay and find out why NITEBAT spent so much time contending for it.

    DDMMMYYYY  HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
    >H1 =JDENQ=============SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D====4
    C Waiting  JES Job  T SrvClass   %Delay  %Delay Wait Major    Minor RName    EN
    - Job----- Number   - -------- This Enq All Enq Want QName--- -------------- St
      MV50CAST STC07172 S SLOW                 1.22 Excl SYSZTIOT    = /         En
      MTADOM01 STC07219 S STCNRM               3.26 Excl SYSDSN   LGS1.CNTL      En
      LGS11Q1  STC07177 S SLOW                 3.26 Excl SYSDSN   LGS1.CNTL      En
      NITEBAT  STC07093 S STCNRM              100.0 Excl SYSDSN   SYS.MCS.MCS    En

    The Waiting Job column tells you that NITEBAT is waiting for the logical enqueue resource that is identified by the major name SYSDSN, indicating that the resource is a data set. The minor name, SYS.MCS.MCS, is the name of the data set itself. If you scroll to the right by using PF11, the Owning Job column shows that a job called DDBBKUP currently owns the resource.

  4. To find out more about this job, position your cursor under Minor RName and press Enter to display the JUENQ view, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
    >H1 =JDENQ====JUENQ====SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D====1
    C Owning   JES Job  T SrvClass %Use Ownr Major    Minor RName    ENQ    Waiting
    - Job----- Number   - --------  ENQ Has- QName--- -------------- Status Job----
      DDBBKUP  STC07093 S STCNRM   97.2 Excl SYSDSN   SYS.MCS.MCS    Ended  NITEBAT

    There is the problem. DDBBKUP has been assigned exclusive (Excl) use of this enqueue resource and held it for 97 percent of the 1:15 to 1:30 A.M. interval. All other jobs, including NITEBAT, are locked out of this resource until DDBBKUP completes execution.

    Now that you know what caused last night’s delay, you are in a position to ensure that it does not happen again. One solution is to reschedule DDBBKUP so that it runs after NITEBAT has completed (although your site might prefer an alternative method).
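
The interval arithmetic behind this scenario can be sketched in a few lines of Python. This is only an illustration, not part of CMF MONITOR Online: it assumes 15-minute recording intervals aligned to the clock (as the 1:15 to 1:30 A.M. interval suggests), and the function names and the year in the timestamp are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: 15-minute, clock-aligned recording intervals, as in the scenario.
INTERVAL_MINUTES = 15


def containing_interval(ts, minutes=INTERVAL_MINUTES):
    """Return (start, end) of the clock-aligned interval that contains ts."""
    start = ts.replace(minute=ts.minute - ts.minute % minutes,
                       second=0, microsecond=0)
    return start, start + timedelta(minutes=minutes)


def held_duration(start, end, percent_use):
    """Estimate how long a resource was held, given its %Use over the interval."""
    return (end - start) * (percent_use / 100.0)


# The year is arbitrary; the scenario leaves it unspecified (YYYY).
start, end = containing_interval(datetime(2000, 11, 10, 1, 20, 0))

# 97.2 is the %Use ENQ value shown for DDBBKUP in the JUENQ view.
held = held_duration(start, end, 97.2)

print("interval:", start.time(), "to", end.time())
print("held for about", round(held.total_seconds() / 60, 1), "minutes")
```

Under these assumptions, DDBBKUP held the data set for roughly 14.6 of the 15 minutes, which is why NITEBAT made almost no progress during that interval.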

Scenario 2: Is the problem on another system?

As you survey the system, you notice from the WRT view that the workload TSO1 experienced an extremely high response time during performance period 3: a full 17.43 seconds. Performance period 3 is typically characterized by both heavy computations and heavy I/O. Which one is responsible for the TSO1 delay?

To check if the problem is with another system

  1. To begin your investigation, type WDELAY on the COMMAND line to display an overview of all workload delays, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> CSR  
    CURR WIN ===> 1        ALT WIN ===>                                            
    >W1 =WDELAY============SYSE=====*========DDMMMYYYY==HH:MM:SS====CMF======D===55
    C Workload T #AS        Total Delay%   %Dly  %Dly  %Dly  %Dly  %Dly  %Dly  %Dly
    - -------- - ---        0....50...100   CPU   Dev  Stor   ENQ   SRM  Subs  Idle
      TSO1     T   3  75.01 **********    14.77  60.3         
      BATCH    W   2  34.84 ****           3.68 30.37        0.79                  
      BATNRM   S   2  34.84 ****           3.68 30.37        0.79                  
      STCNRM   S  67   2.99                0.04  0.02                    2.93 85.74
      STC      W  71   2.82                0.03  0.02                    2.77 80.91
      ALLWKLDS C 194   1.32                0.07  0.31        0.01        0.93 66.80
      ALLSTC   S 162   1.14                0.04  0.01                    1.10 62.84
      SYSSTC   S  73   0.04                0.04                               44.06
      SYSTEM   S  18   0.04                0.02  0.02                         70.26
      TSONRM   S  28   0.04                0.03              0.01             98.11
      TSO      W  28   0.04                0.03              0.01             98.11
      ALLTSO   T  28   0.04                0.03              0.01             97.97
      SYSTEM   W  91   0.03                0.03  0.00                         49.19
      CICST1   S       0.00                                                        
      CICSNRM  S       0.00                                                        
      APPCHOT  S       0.00                                                        
      CICSHOT  S       0.00                                                        
      RMF      W       0.00                                                        
      IMSNRM   S       0.00

    As you can see, workload TSO1 has been delayed for 75 percent of the current interval, and device delay (the %Dly Dev column) accounts for about 60 of those 75 percentage points. How widespread is the problem? Were all of the address spaces in TSO1 delayed?

    To find out, open another window by using the VS (vertical split) command.

  2. On the COMMAND line, type VS, but do not press Enter yet.
  3. Position your cursor at the %Dly CPU field.
  4. Press Enter.
  5. In the CURR WIN field, type 1.
  6. In the ALT WIN field, type 2.
  7. Hyperlink on the Total Delay% column for workload TSO1. The JDELAY view is displayed in window 2, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 2        ALT WIN ===>                                            
    >W1 -WDELAY------------SYSE-----*---- >W2 =JDELAY============SYSE=====*========
    C Workload T #AS        Total Delay%  | C Jobname  JES Job  T SrvClass Step    
    - -------- - ---        0....50...100 | - -------- Number   - -------- Data    
      TSO1     T   3  75.01 **********    |   USER1    JOB05805 T PGRP0002 NO    93
      BATCH    W   2  34.84 ****          |   LGS12    JOB05365 T PGRP0002 NO    12
      BATNRM   S   2  34.84 ****          |   DSF1     JOB05809 T PGRP0001 NO       
      STCNRM   S  67   2.99               |
      STC      W  71   2.82               |
      ALLWKLDS C 194   1.32               |                                        
      ALLSTC   S 162   1.14               |                                        
      SYSSTC   S  73   0.04               |                                        
      SYSTEM   S  18   0.04               |                                        
      TSONRM   S  28   0.04               |                                        
      TSO      W  28   0.04               |                                        
      ALLTSO   T  28   0.04               |                                        
      SYSTEM   W  91   0.03               |                                        
      CICST1   S       0.00               |                                        
      CICSNRM  S       0.00               |                                        
      APPCHOT  S       0.00               |                                        
      CICSHOT  S       0.00               |                                        
      RMF      W       0.00               |                                        
      IMSNRM   S       0.00               |

    JDELAY reports the delays experienced by each job in TSO1. As you can see, the job USER1 has been delayed 93.29 percent of the current interval, 92.93 percent of which was spent waiting for a device.

    To find out which device is responsible, open another window by using the HS (horizontal split) command:

  8. On the COMMAND line, type HS and position your cursor about halfway down the screen; press Enter.
  9. In the CURR WIN field, type 2.
  10. Press PF11 to scroll to the right to see the JDELAY %Dly DEV field.
  11. In the ALT WIN field, type 3 to direct the forthcoming view to the new window.
  12. Position the cursor on the JDELAY %Dly DEV field; press Enter. The JDDEV view is displayed in window 3, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MainView WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 2        ALT WIN ===>                                            
    >W1 -WDELAY------------SYSE-----*----- >W2 =JDELAY============SYSE=====*========
    C Workload T #AS        Total Delay%   | C Jobname  %Dly  %Dly  %Dly  %Dly  %D
    - -------- - ---        0....50...100  | - --------  CPU   DEV  Stor   ENQ   S
      TSO1     T   3  75.01 **********     |   USER1    3.85 92.93
      BATCH    W   2  34.84 ****           |   LGS12    1.28
      BATNRM   S   2  34.84 ****           |   DSF1     1.28  
      STCNRM   S  67   2.99                |
      STC      W  71   2.82                |
      ALLWKLDS C 194   1.32                |                                        
      ALLSTC   S 162   1.14                |                                        
      SYSSTC   S  73   0.04                |                                        
      SYSTEM   S  18   0.04                |                                        
      TSONRM   S  28   0.04                |                                        
      TSO      W  28   0.04                |                                        
      ALLTSO   T  28   0.04                |                                        
      SYSTEM   W  91   0.03                |                                        
      CICST1   S       0.00                |                                        
    >W3 -JDDEV-------------SYSE-----*----- |                                        
    C Jobname  T SrvClass %Dly %Delay %Dly |                                        
    - -------- - -------- DASD Volser Tape |                                        
      USER1    S SYSTEM  92.93  92.93      |                                        
                                           |

    JDDEV displays information about jobs delayed because of contention for one or more devices during the interval. In this case, USER1 was delayed by DASD contention for 92.93 percent of the interval.

    Using the CMF MONITOR hyperlinks that are available on most fields, you can explore any problem to the desired degree of depth. For example, you can hyperlink from VOLSER SYSR2C to DEVINFO, which shows detailed information about the device specified; from there, you can hyperlink to other fields.

    You can also use the CONtext command, with an SSI context name if your site has defined one, to see if there is contention for your device on another system. For information about using CONtext and SSI, see Using CMF MONITOR Online on multiple systems.
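
The drill-down logic used in this scenario (find the most delayed workload, then find its largest delay category) can be sketched in Python as follows. The values are transcribed by hand from the first WDELAY figure in this scenario; the dictionary layout and function names are hypothetical and do not represent a CMF MONITOR export format or API.

```python
# Rows transcribed by hand from the WDELAY figure in Scenario 2.
# The layout is hypothetical; it is not a CMF MONITOR export format.
rows = [
    {"workload": "TSO1", "total": 75.01,
     "delays": {"CPU": 14.77, "Dev": 60.3}},
    {"workload": "BATCH", "total": 34.84,
     "delays": {"CPU": 3.68, "Dev": 30.37, "ENQ": 0.79}},
    {"workload": "STCNRM", "total": 2.99,
     "delays": {"CPU": 0.04, "Dev": 0.02, "Subs": 2.93}},
]


def worst_workload(rows):
    """Return the row with the highest Total Delay%."""
    return max(rows, key=lambda r: r["total"])


def dominant_delay(row):
    """Return (category, percent) for the row's largest %Dly column."""
    return max(row["delays"].items(), key=lambda kv: kv[1])


worst = worst_workload(rows)
category, percent = dominant_delay(worst)
print(worst["workload"], "is delayed mostly by", category, "at", percent, "percent")
```

The largest delay category tells you which field to hyperlink on next: here it is the device delay, which leads to the JDDEV view.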

 
