Scenario 4: Why is MFGTSO not meeting its service objectives?


A phone call from Manufacturing informs you that the department’s TSO users are not receiving adequate response time. You have to find out why and resolve the problem.

To verify a response time problem

Your first step is to verify that a problem actually exists. You know that the workload for the department, MFGTSO, is supposed to have a response time of 3 seconds for 99 percent of its transactions. How successful is MFGTSO in meeting this objective?

To find out, follow this procedure:

  1. On the COMMAND line, type WOBJ MFGTSO.
    The WOBJ view is displayed, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
     W1 =WOBJ==============SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
     C Workload Intvl Typ #AS       % Service Objective  Tran   Tran   Job  Jobs    
     - -------- Time- --- ---       0.......50......100  Rate  Total Total  /Min    
       MFGTSO   15:33 TSO  14  65.0 ************         0.04    337

    As you can see, a problem does indeed exist: MFGTSO is meeting only 65 percent of its service objectives. Obviously, MFGTSO is experiencing a delay somewhere in the system.

  2. Position your cursor in the Workload field, and then press Enter to hyperlink on MFGTSO.
    A menu displays a range of view choices, all specific to MFGTSO, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
     W1 =WOBJ=====EZMWORK==SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
                                      Workload Menu        > View RealTime Menu
                                   Timeframe - Interval                             
                                                                                   
            Current Workload ->  MFGTSO                                             
                                                                                   
         Activity                +----------------------+    Resource Usage         
       . Overview                |   Place cursor on    |  . Service Objectives     
       . Workflow                |    menu item and     |  . SRM Service Units      
       . Delay Reasons           |     press ENTER      |  . Response Times         
       . Using Resources         +----------------------+  . Address Spaces         
       . Paging                                                                     
       . Trending                                          . Administration   

                                                           . Return...

  3. Place the cursor on Delay Reasons, and then press Enter to determine the causes of any delays to MFGTSO address spaces.
    The WDELAY view is displayed, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
    >W1 =WOBJ=====WDELAY===SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
     C Workload Intvl T #AS        Total Delay%   %Dly  %Dly  %Dly  %Dly  %Dly  %Dly
     - -------- Time- - ---        0....50...100   CPU   Dev  Stor   ENQ   SRM  Subs
       MFGTSO   15:33 T  33  45.03 *****         11.26   8.8             25.04

    According to the Total Delay% column, MFGTSO spent 45 percent of the last interval waiting for one or more resources. The %Dly SRM field tells you that the highest portion of the delay, 25 percent, was due to SRM swapping.
    To find out what kind of swapout is causing the delay, you need the WSRMD view, which tells you how long a workload is delayed due to SRM-recommended swapouts.
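
The reading of the WDELAY row in step 3 can be sketched in a few lines of Python. This is only an illustration of the comparison, not part of the product: the dictionary and the `largest_delay` helper are hypothetical, and the sketch assumes that the three populated cells in the figure belong to the CPU, Dev, and SRM columns, as the text implies.

```python
# Delay percentages copied from the MFGTSO row of the WDELAY figure.
# Assumption: blank cells in the figure are treated as 0.0.
delays = {
    "CPU": 11.26,
    "Dev": 8.8,
    "Stor": 0.0,
    "ENQ": 0.0,
    "SRM": 25.04,
    "Subs": 0.0,
}


def largest_delay(delay_pct):
    """Return (category, percent) for the largest delay contributor."""
    category = max(delay_pct, key=delay_pct.get)
    return category, delay_pct[category]


category, pct = largest_delay(delays)
print(f"Largest delay: {category} at {pct:.2f}%")
```

With the values from the figure, the SRM column is the one selected, which is why the investigation continues with the SRM delay view.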

  4. Hyperlink on %Dly SRM or, on the COMMAND line, type WSRMD MFGTSO.
    The WSRMD view is displayed, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
     W1 =WOBJ=====WSRMD====SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
     C Workload Intvl Typ #AS  %Dly  %Dly  %Dly  %Dly  %Dly  %Dly  %Dly  %Dly  %Dly
     - -------- Time- --- ---   SRM  AuxS RealS ReqSw NqxSw ExcSw UniSw TrnSw   RTO
       MFGTSO   15:33 SCL   5  25.0               0.8              18.2

    One look at the WSRMD %Dly UniSw field tells you that the highest portion of the total delay, 18.2 percent, is due to unilateral swapout. What is causing this delay? Is this a system-wide problem, or is it confined solely to the domain containing MFGTSO? To find out, you continue the investigation.

  5. Position your cursor under the %Dly UniSw field, and then press Enter to hyperlink to the SWPINFO view, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===>          ALT WIN ===>                                            
     W1 =SWPINFO===========SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
       Swap           Swap      Log      Log     Expd     Expd      Aux    Total    
       Reason       --Rate   -Swaps   Effect   Direct   Effect   Direct    Swaps    
       Term In...      0.8    100.0      0.0      0.0    100.0      0.0     6.35    
       Term Out..      2.6    100.0      0.0      0.0    100.0      0.0    20.63    
       Long wait.      0.0      0.0      0.0      0.0      0.0      0.0     0.00    
       Det. wait.      0.0      0.0      0.0      0.0      0.0      0.0     0.00    
       Unilateral      8.2      0.0      0.0    100.0    100.0      0.0    65.08    
       Enqueue...      0.0      0.0      0.0      0.0      0.0      0.0     0.00    
       Exchange..      0.0      0.0      0.0      0.0      0.0      0.0     0.00    
       Request...      0.0      0.0      0.0      0.0      0.0      0.0     0.00    
       Aux. stor.      0.0      0.0      0.0      0.0      0.0      0.0     0.00    
       Cent. stor      1.0    100.0      0.0      0.0    100.0      0.0     7.94    
       Total Swap     12.6     95.7      0.0      4.3    100.0      0.0   100.00

    According to the information displayed in the Swap Rate column, the total unilateral swap rate is indeed quite high: 8.2 swaps per second. Clearly, this is a system-wide problem. The probable explanation is that an SRM MPL adjustment threshold has been exceeded, and that SRM's attempts to compensate are adversely affecting MFGTSO performance.
    To prove your theory, you need additional information about how SRM is operating.
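
The arithmetic behind the SWPINFO figure in step 5 can be checked with a short Python sketch. It assumes that the Total Swaps column is each reason's swap rate expressed as a percentage of the total swap rate. That reading is consistent with the figure but is not stated by the product. The `swap_shares` helper is hypothetical.

```python
# Nonzero swap rates copied from the Swap Rate column of the SWPINFO figure.
swap_rates = {
    "Term In": 0.8,
    "Term Out": 2.6,
    "Unilateral": 8.2,
    "Cent. stor": 1.0,
}


def swap_shares(rates):
    """Express each swap rate as a percentage of the total swap rate."""
    total = sum(rates.values())
    return {reason: 100.0 * rate / total for reason, rate in rates.items()}


shares = swap_shares(swap_rates)
# List the reasons from the largest share to the smallest.
for reason, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{reason:<12}{share:6.2f}%")
```

Under that assumption, unilateral swaps account for roughly 65 percent of all swaps (8.2 of 12.6), in line with the 65.08 shown in the figure.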

  6. On the COMMAND line, type MPLSTAT.
    The SRM system MPL thresholds and their current values are displayed, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
     W1 =MPLSTAT===========SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
       Keyword      Low     Curr     High   Effect   Comb   Description             
       -------   Thresh    Value   Thresh   ------    Ind   --------------------    
       RCCCPUT      128      104      128      Inc     No   CPU Utilization         
       RCCCPUP        0      104        0     None    Yes   CPU Utilization         
       RCCMSPT       20      100       40      Dec    Yes   Page delay time         
       RCCPDLT        0        0        0     None     No   Page delay time         
       RCCPTRT       60        0       80      Inc     No   Page fault rate         
       PAGERTn       10        0        0     None    Yes   Demand paging rate      
       RCCASMT     1000        0       90      Inc     No   ASM queue length        
       RCCUICT        2      180        4      Inc     No   Unref Interval Count    
       RCCFXTT       66       38       72      Inc     No   % of storage fixed      
       RCCFXET       82       28       88      Inc     No   %storage <16M fixed

    Your theory is correct. According to the RCCMSPT row, the system’s high threshold is set for 40 milliseconds, and the current value is 100. SRM is trying to reduce the work in the system by lowering domain MPLs.
    One solution is to simply change the page delay time high threshold to 100 or higher. Before doing so, however, you must first ensure that you will not cause another problem by increasing the burden on the page data set configuration.
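
The check made against the RCCMSPT row in step 6 amounts to comparing the current value with the high threshold. The following Python sketch shows that comparison for this one row only. The `Threshold` type and `exceeds_high` helper are hypothetical, and the rule is not generalized to the other keywords, which may be interpreted differently.

```python
from typing import NamedTuple


class Threshold(NamedTuple):
    keyword: str
    low: float
    current: float
    high: float


# Values copied from the RCCMSPT row of the MPLSTAT figure.
rccmspt = Threshold("RCCMSPT", low=20, current=100, high=40)


def exceeds_high(t: Threshold) -> bool:
    """True when the current value is above the high threshold."""
    return t.current > t.high


print(exceeds_high(rccmspt))                     # threshold as shown in the figure
print(exceeds_high(rccmspt._replace(high=100)))  # after raising the high threshold to 100
```

With the figure's values the first check is true, and it becomes false once the high threshold is raised to 100 or higher, which is the change the text proposes.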

  7. To survey the configuration’s current status, display the PGDSTAT view, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
    >W1 =PGDSTAT===========SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====4
     C DS PGDS  Volser  Dev Sts %Slts  Page I/O Rq AvgPg V %Busy Dataset Name
     - -- Type- ------  Num ---  Used XfrTm   Rate / I/O - ----- -------------------
        0 PLPA  MVSD12  D12 OK  64.24 37.44   0.05  0.64 N       PAGE.VMVSD12.PLPA
        1 COMN  MVSD12  D12 OK   0.52                    N       PAGE.VMVSD12.COMMON
        3 LOCL  MVSD12  D12 OK  92.31 25.21   1.25  7.95 Y 38.04 PAGE.VPAGE01.LOCAL1
        4 LOCL  MVSD12  D12 OK   5.01  4.62   0.94  2.94 N  2.59 PAGE.VPAGE02.LOCAL2
        5 LOCL  PAGEC1  917 OK   6.38 14.20   0.27  3.64 N  0.52 PAGE.VPAGEC1.LOCAL3
        6 LOCL  PAGEC1  917 OK   5.43 15.19   0.13  7.30 N  0.52 PAGE.VPAGEC2.LOCAL4
        7 LOCL  PAGEC1  917 OK  18.82 17.64   0.15  8.06 N  0.52 PAGE.VPAGEC3.LOCAL5
        8 LOCL  PAGEC1  917 OK  16.23  9.71   0.13  9.58 N       PAGE.VPAGEC4.LOCAL6

    It is a good thing that you checked. PAGE.VPAGE01.LOCAL1 is the only local data set enabled for VIO and is already under heavy demand. If you do not add another VIO-enabled data set before changing the RCCMSPT threshold, you will create a significant local paging delay.
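
The survey of the PGDSTAT figure in step 7 can be sketched as a filter over the local page data sets. The Python below is illustrative only: the tuple layout and the `vio_enabled` helper are hypothetical, and the sketch assumes that the V column is the VIO indicator and that the preceding percentage is the %Slts Used value.

```python
# Local page data sets copied from the PGDSTAT figure:
# (data set name, % slots used, VIO flag, % busy or None when blank)
local_data_sets = [
    ("PAGE.VPAGE01.LOCAL1", 92.31, "Y", 38.04),
    ("PAGE.VPAGE02.LOCAL2", 5.01, "N", 2.59),
    ("PAGE.VPAGEC1.LOCAL3", 6.38, "N", 0.52),
    ("PAGE.VPAGEC2.LOCAL4", 5.43, "N", 0.52),
    ("PAGE.VPAGEC3.LOCAL5", 18.82, "N", 0.52),
    ("PAGE.VPAGEC4.LOCAL6", 16.23, "N", None),
]


def vio_enabled(data_sets):
    """Return only the data sets whose VIO flag is Y."""
    return [ds for ds in data_sets if ds[2] == "Y"]


for name, slots_used, _, busy in vio_enabled(local_data_sets):
    print(f"{name}: {slots_used}% slots used, {busy}% busy")
```

Only one data set passes the filter, and it is also the one with the highest slot usage and busy percentage in the figure.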

To add another VIO-enabled data set

  1. On the COMMAND line, type HS (horizontal split) to open another window, position your cursor halfway down the screen, and then press Enter.
  2. On the COMMAND line, type CONSOLE to simulate the z/OS console, and then press Enter.
  3. On the COMMAND line, issue the MVS PAGEADD command, specifying a data set (for example, / PAGEADD PAGE.SJSE.COM), and then press Enter.
    Now that you have added a data set to offset the load on PAGE.VPAGE01.LOCAL1, you can adjust both thresholds for RCCMSPT safely.

To adjust both thresholds while using the CONSOLE view

  1. Access SYS1.PARMLIB(IEAIPSxx) in edit mode.
  2. Assign higher values to RCCMSPL and RCCMSPH.
  3. Issue the SET command to refresh the SRM IPS.
    Because of these increased values, SRM should no longer attempt to reduce work in the system.
  4. Was this the correct solution? Is MFGTSO now meeting its service objective? To find out, display WOBJ, as shown in the following figure.

    DDMMMYYYY  HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
    COMMAND  ===>                                                 SCROLL ===> PAGE
    CURR WIN ===> 1        ALT WIN ===>                                            
    >W1 =WOBJ=============SYSE=====*========DDMMMYYYY==HH:MM:SS=====MVMVS====D====4
     C Workload Intvl Typ #AS       % Service Objective  Tran   Tran   Job  Jobs    
     - -------- Time- --- ---       0.......50......100  Rate  Total Total  /Min    
       JCTEST   09:28 TSO   4 101.3 ******************+  0.61    325                
       MFGTSO   09:28 TSO  10 112.0 ******************+  0.57    418

    You can see that MFGTSO is now successfully meeting its service objectives.
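
The before-and-after comparison can be summarized in a small Python sketch. It assumes that a % Service Objective value of 100 or more means the objective is being met. That reading matches the 0 to 100 scale in the WOBJ figures but is an assumption, and the `meets_objective` helper is hypothetical.

```python
# % Service Objective values for MFGTSO copied from the two WOBJ figures.
before_pct = 65.0
after_pct = 112.0


def meets_objective(pct, target=100.0):
    """Assumed rule: the objective is met at or above the target percentage."""
    return pct >= target


print(f"Before: {before_pct} -> met: {meets_objective(before_pct)}")
print(f"After:  {after_pct} -> met: {meets_objective(after_pct)}")
```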

 
