Scenario 4: Why is MFGTSO not meeting its service objectives?
A phone call from Manufacturing informs you that the department’s TSO users are not receiving adequate response time. You have to find out why and resolve the problem.
To verify a response time problem
Your first step is to verify that a problem actually exists. You know that the workload for the department, MFGTSO, is supposed to have a response time of 3 seconds for 99 percent of its transactions. How successful is MFGTSO in meeting this objective?
To find out, follow this procedure:
On the COMMAND line, type WOBJ MFGTSO.
The WOBJ view is displayed, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
W1 =WOBJ==============SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
C Workload Intvl Typ #AS % Service Objective Tran Tran Job Jobs
- -------- Time- --- --- 0.......50......100 Rate Total Total /Min
MFGTSO 15:33 TSO 14 65.0 ************ 0.04 337
As you can see, a problem does indeed exist: MFGTSO is meeting only 65 percent of its service objectives. MFGTSO is evidently experiencing a delay somewhere in the system.
Position your cursor in the Workload field, and then press Enter to hyperlink on MFGTSO.
A menu displays a range of view choices, all specific to MFGTSO, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
W1 =WOBJ=====EZMWORK==SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
Workload Menu > View RealTime Menu
Timeframe - Interval
Current Workload -> MFGTSO
Activity +----------------------+ Resource Usage
. Overview | Place cursor on | . Service Objectives
. Workflow | menu item and | . SRM Service Units
. Delay Reasons | press ENTER | . Response Times
. Using Resources +----------------------+ . Address Spaces
. Paging
. Trending . Administration
. Return...
Place the cursor on Delay Reasons, and then press Enter to determine the causes of any delays to MFGTSO address spaces.
The WDELAY view is displayed, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
>W1 =WOBJ=====WDELAY===SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
C Workload Intvl T #AS Total Delay% %Dly %Dly %Dly %Dly %Dly %Dly
- -------- Time- - --- 0....50...100 CPU Dev Stor ENQ SRM Subs
MFGTSO 15:33 T 33 45.03 ***** 11.26 8.8 25.04
According to the Total Delay% column, MFGTSO spent 45 percent of the last interval waiting for one or more resources. The %Dly SRM field tells you that the largest portion of the delay, 25 percent, was due to SRM swapping.
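The reading of the WDELAY row can also be expressed as a small calculation. The following Python sketch is illustrative only and is not part of the product. The values are copied from the figure, and the assignment of 11.26 and 8.8 to the CPU and Dev columns is an assumption, because the text confirms only the SRM value.

```python
# Delay percentages for MFGTSO copied from the WDELAY figure.
# Only the SRM value is confirmed by the text; mapping the other two
# values to the CPU and Dev columns is an assumption.
delays = {"CPU": 11.26, "Dev": 8.8, "SRM": 25.04}

# The category with the largest delay percentage is the one to
# investigate first.
largest = max(delays, key=delays.get)
print(f"Largest delay category: {largest} ({delays[largest]:.2f}%)")
```

Whatever the exact column mapping of the two smaller values, the SRM delay is more than twice as large as either, which is why the investigation continues with the SRM views.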
To find out what kind of swapout is causing the delay, you need the WSRMD view, which tells you how long a workload is delayed due to SRM-recommended swapouts.
Hyperlink on %Dly SRM or, on the COMMAND line, type WSRMD MFGTSO.
The WSRMD view is displayed, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
W1 =WOBJ=====WSRMD====SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
C Workload Intvl Typ #AS %Dly %Dly %Dly %Dly %Dly %Dly %Dly %Dly %Dly
- -------- Time- --- --- SRM AuxS RealS ReqSw NqxSw ExcSw UniSw TrnSw RTO
MFGTSO 15:33 SCL 5 25.0 0.8 18.2
One look at the WSRMD %Dly UniSw field tells you that the largest portion of the total delay, 18.2 percent, is due to unilateral swapout. What is causing this delay? Is this a system-wide problem, or is it confined solely to the domain containing MFGTSO? To find out, you continue the investigation.
Position your cursor under the %Dly UniSw field, and then press Enter to hyperlink to the SWPINFO view, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> ALT WIN ===>
W1 =SWPINFO===========SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
Swap Swap Log Log Expd Expd Aux Total
Reason --Rate -Swaps Effect Direct Effect Direct Swaps
Term In... 0.8 100.0 0.0 0.0 100.0 0.0 6.35
Term Out.. 2.6 100.0 0.0 0.0 100.0 0.0 20.63
Long wait. 0.0 0.0 0.0 0.0 0.0 0.0 0.00
Det. wait. 0.0 0.0 0.0 0.0 0.0 0.0 0.00
Unilateral 8.2 0.0 0.0 100.0 100.0 0.0 65.08
Enqueue... 0.0 0.0 0.0 0.0 0.0 0.0 0.00
Exchange.. 0.0 0.0 0.0 0.0 0.0 0.0 0.00
Request... 0.0 0.0 0.0 0.0 0.0 0.0 0.00
Aux. stor. 0.0 0.0 0.0 0.0 0.0 0.0 0.00
Cent. stor 1.0 100.0 0.0 0.0 100.0 0.0 7.94
Total Swap 12.6 95.7 0.0 4.3 100.0 0.0 100.00
According to the Swap Rate column, the unilateral swap rate is indeed quite high: 8.2 swaps per second out of a total of 12.6. Clearly, this is a system-wide problem. The probable explanation is that an SRM MPL adjustment threshold has been exceeded, and that SRM's attempts to compensate are adversely affecting MFGTSO performance.
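To see how heavily unilateral swaps dominate, you can work out each swap reason's share of the total swap rate. The following Python sketch is illustrative only. It uses the nonzero Swap Rate values from the SWPINFO figure, and the results match the values shown in the Total Swaps column, which suggests that the column reports the same percentage.

```python
# Nonzero swap rates (swaps per second) copied from the SWPINFO figure.
swap_rates = {
    "Term In": 0.8,
    "Term Out": 2.6,
    "Unilateral": 8.2,
    "Cent. stor": 1.0,
}

# Total swap rate; the figure reports 12.6 in the Total Swap row.
total = sum(swap_rates.values())

# Each reason's share of the total, as a percentage.
shares = {reason: round(100 * rate / total, 2)
          for reason, rate in swap_rates.items()}

for reason, share in shares.items():
    print(f"{reason:<12}{share:6.2f}%")
```

Unilateral swaps account for roughly 65 percent of all swaps, far more than any other reason.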
To prove your theory, you need additional information about how SRM is operating.
On the COMMAND line, type MPLSTAT.
The SRM system MPL thresholds and their current values are displayed, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
W1 =MPLSTAT===========SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====1
Keyword Low Curr High Effect Comb Description
------- Thresh Value Thresh ------ Ind --------------------
RCCCPUT 128 104 128 Inc No CPU Utilization
RCCCPUP 0 104 0 None Yes CPU Utilization
RCCMSPT 20 100 40 Dec Yes Page delay time
RCCPDLT 0 0 0 None No Page delay time
RCCPTRT 60 0 80 Inc No Page fault rate
PAGERTn 10 0 0 None Yes Demand paging rate
RCCASMT 1000 0 90 Inc No ASM queue length
RCCUICT 2 180 4 Inc No Unref Interval Count
RCCFXTT 66 38 72 Inc No % of storage fixed
RCCFXET 82 28 88 Inc No %storage <16M fixed
Your theory is correct. According to the RCCMSPT row, the system’s high threshold is set to 40 milliseconds, and the current value is 100. Because the current value exceeds the high threshold, SRM is trying to reduce the work in the system by lowering domain MPLs.
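The comparison behind this conclusion can be sketched in a few lines of Python. This is a simplified illustration, not a description of how SRM is implemented: the function name and the rule it applies are assumptions, and the rule does not reproduce the Effect column for every row in the figure (RCCUICT, for example, shows Inc with a current value above its high threshold).

```python
def mpl_effect(low, curr, high):
    """Return the expected MPL adjustment for one threshold row.

    Assumed simplification: a current value above the high threshold
    pushes MPLs down ("Dec"), a value below the low threshold allows
    them to rise ("Inc"), and anything in between has no effect.
    """
    if curr > high:
        return "Dec"
    if curr < low:
        return "Inc"
    return "None"


# RCCMSPT row from the MPLSTAT figure: low 20, current 100, high 40.
print("RCCMSPT:", mpl_effect(20, 100, 40))
```

Under this simplified rule, raising the RCCMSPT high threshold to 100 or more would stop the current value from triggering a decrease.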
One solution is to simply change the page delay time high threshold to 100 or higher. Before doing so, however, you must first ensure that you will not cause another problem by increasing the burden on the page data set configuration.
To survey the configuration’s current status, display the PGDSTAT view, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
>W1 =PGDSTAT===========SYSE=====*========DDMMMYYYY==HH:MM:SS====MVMVS====D====4
C DS PGDS Volser Dev Sts %Slts Page I/O Rq AvgPg V %Busy Dataset Name
- -- Type- ------ Num --- Used XfrTm Rate / I/O - ----- -------------------
0 PLPA MVSD12 D12 OK 64.24 37.44 0.05 0.64 N PAGE.VMVSD12.PLPA
1 COMN MVSD12 D12 OK 0.52 N PAGE.VMVSD12.COMMON
3 LOCL MVSD12 D12 OK 92.31 25.21 1.25 7.95 Y 38.04 PAGE.VPAGE01.LOCAL1
4 LOCL MVSD12 D12 OK 5.01 4.62 0.94 2.94 N 2.59 PAGE.VPAGE02.LOCAL2
5 LOCL PAGEC1 917 OK 6.38 14.20 0.27 3.64 N 0.52 PAGE.VPAGEC1.LOCAL3
6 LOCL PAGEC1 917 OK 5.43 15.19 0.13 7.30 N 0.52 PAGE.VPAGEC2.LOCAL4
7 LOCL PAGEC1 917 OK 18.82 17.64 0.15 8.06 N 0.52 PAGE.VPAGEC3.LOCAL5
8 LOCL PAGEC1 917 OK 16.23 9.71 0.13 9.58 N PAGE.VPAGEC4.LOCAL6
It is a good thing that you checked. PAGE.VPAGE01.LOCAL1 is the only local data set enabled for VIO, and it is already under heavy demand. If you do not add another VIO-enabled data set before changing the RCCMSPT threshold, you will create a significant local paging delay.
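The check that you just made by eye can be sketched as a simple filter over the local page data sets. The following Python example is illustrative only. The rows are copied from the PGDSTAT figure, and reading the first numeric value as %Slts Used and the value after the V flag as %Busy is an assumption about the column layout.

```python
# Local page data sets copied from the PGDSTAT figure:
# (data set name, % slots used, VIO flag, % busy or None when blank).
# The column mapping is an assumption about the figure layout.
local_page_data_sets = [
    ("PAGE.VPAGE01.LOCAL1", 92.31, "Y", 38.04),
    ("PAGE.VPAGE02.LOCAL2", 5.01, "N", 2.59),
    ("PAGE.VPAGEC1.LOCAL3", 6.38, "N", 0.52),
    ("PAGE.VPAGEC2.LOCAL4", 5.43, "N", 0.52),
    ("PAGE.VPAGEC3.LOCAL5", 18.82, "N", 0.52),
    ("PAGE.VPAGEC4.LOCAL6", 16.23, "N", None),
]

# Keep only the data sets that accept VIO pages.
vio_enabled = [row for row in local_page_data_sets if row[2] == "Y"]

for name, slots_used, _, busy in vio_enabled:
    print(f"{name}: {slots_used}% slots used, {busy}% busy")
```

Only one data set passes the filter, and its slot usage is far higher than that of any other local data set.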
To add another VIO-enabled data set
- On the COMMAND line, type HS (horizontal split) to open another window, position your cursor halfway down the screen, and then press Enter.
- On the COMMAND line, type CONSOLE to simulate the z/OS console, and then press Enter.
- On the COMMAND line, issue the MVS PAGEADD command, specifying a data set (for example, / PAGEADD PAGE.SJSE.COM), and then press Enter.
Now that you have added another data set to offset the load on PAGE.VPAGE01.LOCAL1, you can safely adjust both thresholds for RCCMSPT.
To adjust both thresholds while using the CONSOLE view
- Access SYS1.PARMLIB(IEAIPSxx) in edit mode.
- Assign higher values to RCCMSPL and RCCMSPH.
- Issue the SET command to refresh the SRM IPS.
Because of these increased values, SRM should no longer attempt to reduce work in the system. Was this the correct solution? Is MFGTSO now meeting its service objective?
To find out, display WOBJ, as shown in the following figure.
DDMMMYYYY HH:MM:SS ------ MAINVIEW WINDOW INTERFACE (Vv.r.mm) ----------------
COMMAND ===> SCROLL ===> PAGE
CURR WIN ===> 1 ALT WIN ===>
>W1 =WOBJ=============SYSE=====*========DDMMMYYYY==HH:MM:SS=====MVMVS====D====4
C Workload Intvl Typ #AS % Service Objective Tran Tran Job Jobs
- -------- Time- --- --- 0.......50......100 Rate Total Total /Min
JCTEST 09:28 TSO 4 101.3 ******************+ 0.61 325
MFGTSO 09:28 TSO 10 112.0 ******************+ 0.57 418
You can see that MFGTSO is now successfully meeting its service objectives.
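The before-and-after comparison can be written as a short check. The following Python sketch is illustrative only. The two values are copied from the WOBJ figures in this scenario, and treating 100 percent or more as "objective met" is an assumption about how to read the % Service Objective column.

```python
# % Service Objective values for MFGTSO copied from the two WOBJ figures.
before = 65.0   # first WOBJ display, before the change
after = 112.0   # WOBJ display after the thresholds were raised


def meets_objective(pct):
    # Assumption: 100 percent or more means the objective is being met.
    return pct >= 100.0


print("Before:", "met" if meets_objective(before) else "not met")
print("After: ", "met" if meets_objective(after) else "not met")
```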