HCI diagnosis and debugging


If problems occur with the HCI, we recommend that you provide BMC Support with the following information, which aids in diagnosing HCI problems:

  • Storage dumps
  • HCI journal
  • Stub tracing
  • Generalized Trace Facility (GTF) trace.

This topic describes the information types.

Storage Dumps

The HCI invokes the z/OS dumping facilities whenever it detects a problem that cannot automatically be corrected. HCI can request two major types of dumps: SNAP dumps and SDUMPs.
SNAP dumps can be directed to a disk data set or to SYSOUT, based on the DMPCLAS, DMPPFX, DMPUNIT, and DMPVOL attributes on the HCICNGCA element. Specifying DMPCLAS makes specifying the other DMPxxx attributes unnecessary. Specifying DMPPFX, DMPUNIT, and DMPVOL makes specifying DMPCLAS unnecessary. SNAP dumps are relatively slow to generate; consequently, HCI execution can be suspended for an unacceptably long time.
SDUMPs are always directed to the installation-defined SYS1.DUMPxx (or equivalent) data sets. These dumps can be managed by the installation more easily than SNAP dumps, and they can be copied easily to disk or tape. These dumps are generated very quickly, so HCI execution is suspended only briefly. Request an SDUMP by coding the DMPPFX=SDUMP attribute (and omitting the DMPCLAS and DMPVOL attributes) on the HCICNGCA element. We strongly recommend that you code DMPPFX=SDUMP at your site.
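
The attribute rules above can be summarized in a short sketch. The following Python helper is hypothetical and not part of the HCI; it only restates the combinations described in this topic, and it assumes the attributes have already been collected into a dictionary. How the HCI itself treats an incomplete combination is not covered here, so the sketch simply reports it.

```python
def classify_dump_config(attrs):
    """Report which dump type a set of DMPxxx attributes requests.

    attrs is a dict such as {"DMPPFX": "SDUMP"}. This is an illustrative
    helper (an assumption), not an HCI interface.
    """
    def has(name):
        return bool(attrs.get(name))

    # DMPPFX=SDUMP requests an SDUMP; DMPCLAS and DMPVOL must be omitted.
    if attrs.get("DMPPFX") == "SDUMP":
        extra = [n for n in ("DMPCLAS", "DMPVOL") if has(n)]
        if extra:
            raise ValueError("omit " + ", ".join(extra) + " when DMPPFX=SDUMP")
        return "SDUMP"

    # DMPCLAS alone is sufficient for a SNAP dump.
    if has("DMPCLAS"):
        return "SNAP via DMPCLAS"

    # Otherwise all three of DMPPFX, DMPUNIT, and DMPVOL are needed.
    if all(has(n) for n in ("DMPPFX", "DMPUNIT", "DMPVOL")):
        return "SNAP via DMPPFX/DMPUNIT/DMPVOL"

    return "incomplete"


print(classify_dump_config({"DMPPFX": "SDUMP"}))
```

The attribute values used with this helper (other than SDUMP) are placeholders; use the values appropriate for your site.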

Ensure that any dumps sent to BMC contain the following information:

  • Abending PSW
  • Abending register contents
  • Storage: region, private, common, LPA, SQA, and LSQA
  • Region system control blocks, such as ASCB, TCBs, IRBs, among others.
  • Global resource enqueues
  • System trace tables
  • Dump summary.

If you are generating an SDUMP, specify the following options:

RGN,PSA,CSA,LPA,SQA,LSQA,GRSQ,TRT,SUM

and

TYPE=XMEME

If the HCI is running in a sysplex and one or more of the TP applications are running on a different z/OS image than the HCI, then also specify the following option:

COUPLE
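
As a quick cross-check before sending a dump, the option list above can be compared with what was actually specified. The following Python sketch is illustrative only; the function name and its arguments are assumptions, and the required options are taken directly from the lists in this topic.

```python
# SDUMP options listed in this topic.
REQUIRED_SDUMP_OPTIONS = ("RGN", "PSA", "CSA", "LPA", "SQA", "LSQA",
                          "GRSQ", "TRT", "SUM")


def missing_sdump_options(specified, cross_image_tps=False):
    """Return the required SDUMP options absent from a comma-separated list.

    Set cross_image_tps=True when the HCI runs in a sysplex and at least one
    TP application runs on a different z/OS image, which also requires COUPLE.
    """
    given = {opt.strip().upper() for opt in specified.split(",") if opt.strip()}
    required = list(REQUIRED_SDUMP_OPTIONS)
    if cross_image_tps:
        required.append("COUPLE")
    return [opt for opt in required if opt not in given]


print(missing_sdump_options("RGN,PSA,CSA,LPA,SQA,LSQA,GRSQ,TRT,SUM",
                            cross_image_tps=True))
```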

HCI Journal

The HCI journal provides the most comprehensive debugging aid available. Although the journal is not required for most HCI execution, when a problem occurs you should create a journal that captures the failing execution. The journal facility is always available in the HCI, but depending on the setting of the journal mask attribute in the HCICNGCA element and the availability of journal VSAM data sets, the journal might not be active at any given time.
If you need to provide BMC with the journal data for analysis, prepare it for SFT transmission.

Warning

Do not send a print image of the journal data to BMC. Doing so only delays problem analysis. For more information about SFT transmission, see Preparing Journal Data for SFT Transmission below and TCP/IP.

Preparing Journal Data for SFT Transmission

Reformat the contents of a journal data set into a data set that is appropriate for SFT transmission to BMC's SFT site. The following shows an example of the JCL to perform this reformat.

JCL to Reformat Journal Data

Example

//*
//* PLACE YOUR JOB CARD HERE
//*
//UNLOAD EXEC PGM=HCIJUNLD
//STEPLIB DD DSN=users.authorized.hci.loadlib,DISP=SHR
//HCIINDD DD DSN=HCI.JOURNAL.DATA,DISP=SHR
//HCIOUTDD DD DSN=users.new.journal,DISP=(,CATLG),
// UNIT=SYSDA,SPACE=(TRK,(15,15),RLSE,CONTIG)
//SYSUDUMP DD SYSOUT=*

Once the data has been successfully unloaded, you can transmit the unloaded data set via SFT to BMC's SFT site. See Using the TCP/IP SFT Program for more information.

Printing the Journal Data

Do not send a print image of the journal data to BMC, because doing so delays problem analysis. However, if you do need to print the journal data, use the JCL shown in the following code block to minimally format the journal data.

JCL to Print Journal Data

Example

//*
//* PLACE YOUR JOB CARD HERE
//*
//JRNLPRT EXEC PGM=HCIJRPRT,REGION=32M
//STEPLIB DD DSN=users.authorized.hci.loadlib,DISP=SHR
//*
//LOGIN DD DSN=HCI.JOURNAL.DATA,DISP=SHR
//SYSPRINT DD SYSOUT=*
//LOGPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*

GTF Trace

Instances for which a Generalized Trace Facility (GTF) trace may be required include:

  • VTAM trace
  • TCP/IP trace
  • Sysplex trace.

All aspects of preparing, starting, and stopping GTF, as well as sending the GTF data, remain the same. Only the GTF parameters differ.

In an active sysplex environment, you cannot use the HCI journal facility to record the execution of the z/OS systems other than the local z/OS that contains the HCI itself.
Because some sort of journal is required, use the z/OS GTF to record sysplex processing.

Preparing GTF to Gather HCI Data

Do the following to prepare to use the GTF to gather HCI data:

  1. Allocate a data set to contain the GTF trace output.
  2. Create a GTF procedure to invoke GTF with the data set.
  3. Create a SYS1.PARMLIB member to contain the GTF execution time parameters.

The following code block shows an example of a job that allocates a data set to receive the GTF trace data.

JCL to Allocate a GTF Trace data set

Example
//*
//* PLACE YOUR JOB CARD HERE
//*
//ALLOC EXEC PGM=IEFBR14
//GTFDATA DD DSN=HCI.GTFTRACE.DATA,DISP=(,CATLG),UNIT=SYSDA,
// SPACE=(CYL,(40),,CONTIG),VOL=SER=volser,
// DCB=(RECFM=VB,LRECL=8232,BLKSIZE=8236,DSORG=PS)

Specify a data set name and volume serial number appropriate for your site's installation. Code the DCB parameters exactly as shown.
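
The DCB values in the example are related: for variable-length blocked (VB) records, the block size must be at least 4 bytes larger than the logical record length, because each block begins with a 4-byte block descriptor word. The following Python sketch, with a hypothetical function name, shows that arithmetic for the values above. It is only a sanity check and does not replace coding the parameters exactly as shown.

```python
BDW_LENGTH = 4  # bytes of block descriptor word at the start of each VB block


def vb_blksize_ok(lrecl, blksize):
    """Return True if a VB block of blksize bytes can hold one maximum-length record."""
    return blksize >= lrecl + BDW_LENGTH


print(vb_blksize_ok(8232, 8236))
```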

The following code block shows an example of a procedure to invoke GTF.

Procedure to Invoke GTF

Example

//GTFHCI PROC MEMBER=GTFHCI
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,TIME=YES',
// REGION=2280K,DPRTY=(15,15)
//IEFRDER DD DSNAME=HCI.GTFTRACE.DATA,DISP=SHR
//SYSLIB DD DSNAME=SYS1.PARMLIB(&MEMBER),DISP=SHR

The following is an example of a SYS1.PARMLIB member associated with the GTF procedure.

TRACE=USRP
USR=(FE1,FE2,FE4,FEF,FF0,FF1,FF2)
END


Starting GTF

Start the GTF procedure by entering a command on the system console or from an SDSF panel. The following is an example of this command.

START GTFHCI.GTF

The GTF program displays startup information, including a listing of the parameters that are to be used. If its initialization is successful, GTF requests that the operator confirm the startup parameters by replying to an outstanding WTOR. This message is shown below:

*nn AHL125A RESPECIFY TRACE OPTIONS OR REPLY U

Where nn is the reply number. Using the reply number nn, the operator enters the following on the system console or on an SDSF panel:

Rnn,U

GTF displays more diagnostic information, ending with the following message:

AHL031I GTF INITIALIZATION COMPLETE

Important

After GTF has been started successfully, actual tracing to the GTF data set does not occur until the next TP registers with the HCI. Thus, in-flight conversations are not traced. We recommend starting GTF before starting any HCI TPs. In this way, all information concerning those TPs is traced.
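
The reply step in this procedure follows a fixed pattern: the reply number nn from the AHL125A message is placed into the Rnn,U command. The following Python sketch shows that mapping. The function name is hypothetical, and the sketch assumes the message text has the form shown in this topic; it does not issue any console command.

```python
import re

# Matches the outstanding WTOR shown in this topic, for example:
#   *07 AHL125A RESPECIFY TRACE OPTIONS OR REPLY U
_AHL125A = re.compile(r"^\*(\d+)\s+AHL125A\b")


def reply_for(message):
    """Return the Rnn,U reply for an AHL125A message, or None for other text."""
    match = _AHL125A.match(message.strip())
    if match is None:
        return None
    return "R" + match.group(1) + ",U"


print(reply_for("*07 AHL125A RESPECIFY TRACE OPTIONS OR REPLY U"))
```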

Stopping GTF

Stop GTF by entering the following command on the system console or from an SDSF panel:

P GTF

GTF responds with the following message:

AHL006I GTF ACKNOWLEDGES STOP COMMAND

Sending GTF Data for Analysis

You can reformat the contents of the GTF data set into a data set that is appropriate for Secure File Transfer (SFT) transmission to BMC's SFT site, or you can place the data onto magnetic tape to be mailed. Refer to the applicable section.

Preparing GTF Data for SFT Transmission

In order to send the GTF data via SFT, reformat the data. The following code block shows an example of the JCL to perform this reformat.

JCL to Reformat GTF Data

Example

//*
//* PLACE YOUR JOB CARD HERE
//*
//UNLOAD EXEC PGM=HCIJUNLD
//STEPLIB DD DSN=users.authorized.hci.loadlib,DISP=SHR
//HCIINDD DD DSN=HCI.GTFTRACE.DATA,DISP=SHR
//HCIOUTDD DD DSN=users.new.gtfdata,DISP=(,CATLG),
// UNIT=SYSDA,SPACE=(TRK,(15,15),RLSE,CONTIG)
//SYSUDUMP DD SYSOUT=*


Once the GTF data has been successfully unloaded, you can transmit the unloaded data set via SFT to BMC's SFT site.

Using the TCP/IP SFT Program

The programs HCIJUNLD and HCIGUNLD read the variable-length input data and create a fixed-length output data set that can be transmitted by Secure File Transfer (SFT). The format of the records created by HCIJUNLD and HCIGUNLD is known to the corresponding programs that re-create the original data sets.
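
To illustrate why an unload step makes variable-length data safe to send as fixed-length records, the following Python sketch packs records into fixed-length blocks and restores them. This is a generic technique under stated assumptions, not the actual HCIJUNLD or HCIGUNLD record format, which is not described in this topic. The 2-byte length prefix, the zero padding, and the function names are all assumptions; the 4160-byte record length matches the LRECL shown in the sample SFT session.

```python
def pack_fixed(records, lrecl=4160):
    """Pack variable-length byte records into fixed-length blocks.

    Each record is prefixed with a 2-byte big-endian length; the final
    block is padded with zero bytes. Empty records are not supported,
    because a zero length marks the start of the padding.
    """
    stream = bytearray()
    for rec in records:
        if not 0 < len(rec) <= 0xFFFF:
            raise ValueError("record length must be 1..65535 bytes")
        stream += len(rec).to_bytes(2, "big") + rec
    stream += bytes(-len(stream) % lrecl)  # pad to a multiple of lrecl
    return [bytes(stream[i:i + lrecl]) for i in range(0, len(stream), lrecl)]


def unpack_fixed(blocks):
    """Re-create the original records from blocks written by pack_fixed."""
    stream = b"".join(blocks)
    records, pos = [], 0
    while pos + 2 <= len(stream):
        length = int.from_bytes(stream[pos:pos + 2], "big")
        if length == 0:  # reached the zero padding
            break
        records.append(stream[pos + 2:pos + 2 + length])
        pos += 2 + length
    return records


original = [b"A" * 10, b"B" * 5000, b"C"]
blocks = pack_fixed(original)
print(len(blocks), unpack_fixed(blocks) == original)
```

Because every block has the same length, a binary transfer cannot lose record boundaries, and the receiving side can rebuild the original records from the length prefixes.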

Invoking SFT

If TCP/IP is available on a z/OS host, SFT can be invoked from a TSO session, with or without ISPF. If TCP/IP is not available, the output of HCIJUNLD or HCIGUNLD must be transmitted to a workstation that does have TCP/IP on it before the data set can be sent to BMC.

The remainder of this section assumes SFT is available on the host.

Assuming a host-based TCP/IP and no firewall, enter the commands shown in the following example on any ISPF screen. Commands are shown with a right arrow (===>) prefix.

SFT Session with BMC AMI

Example

===> TSO SFT mft.bmc.com
Using 'SYS1.TCPPARMS(SFTDATA)' for local site configuration parameters.
IBM SFT CS V1R11
Connecting to: Secure File Transfer (SFT) (mft.bmc.com) 192.168.97.200 port: 21.
220 Please contact mft.bmc.com if you have any problems or questions.
NAME (mft.bmc.com):

===> ANONYMOUS
>>>User anonymous
331 Password required for anonymous.
PASSWORD:

===> USERID@CUSTOMER.COM
>>>PASS
230 Login OK. Proceed.
Command:

===> CD PUB/CSS/INCOMING
>>>CWD pub/css/incoming
250 Folder changed to "/pub/css/incoming".
Command:

===> BIN
>>>TYPE I
200 Type set to I
Command:

===> PUT HCI.OBJECT(HCI30) HCI30.BIN
>>>SITE FIXrecfm 41600 LRECL=4160 RECFM=F BLKSIZE=4160
501 Command not understood. (You can ignore this error.)
>>>PORT 10,10,0,204,167,33
200 Command okay.
>>>STOR hci30.bin
150 Opening BINARY mode data connection for hci30.bin.
226 Transfer complete. 12720 bytes transferred. 12720 bps.
12720 bytes transferred in 0.005 seconds. Transfer rate 2544.00 Kbytes/sec
Command:

===> QUIT
221 Service closing control connection.

Storage Estimates

The amount of storage that the HCI uses depends heavily on how it is used. The storage falls into four categories, each of which can reside above or below the 16 MB line:

  • Private
  • Common
  • Module
  • Buffer

Private and common storage can each be more than 1 MB above the line. The storage used below the line for private and common is minimal, at around 20 KB. Module storage is under 700 KB, the majority of it above the line (less than 50 KB below the line). Buffer storage can be less than 100 KB above the line. Storage above the line is always used unless it is unavailable or one of the few z/OS facilities involved does not permit it. No storage is allocated above the 64-bit bar.

These amounts are all rough estimates and the actual storage size is directly related to the functions the HCI is performing. The actual storage utilization can be seen using the HCILOOK facility or displayed on the operator console using the DISPLAY STOR HCI command.
To help explain where some of the storage is used, the following table shows those control blocks whose quantity is user-controlled:

User-Controlled Control Blocks Storage

Control Block   Description                  Size (bytes)   Storage
AMB             Access Method Block          728            CSA below
CCB             Conversation Control Block   1264           Private above
DCB             Destination Control Block    760            Private above
JCB             Journal Control Block        104            Private above
PCB             TCP/IP Port Control Block    488            Private above
RCB             Region Control Block         96             CSA above
TPT             TP Profile Table             320            Private above
UIB             User Interface Block         1040           CSA above
WRE             Work Request Element         256            CSA above

The WREs are usually the largest storage item. Currently, the HCI does not tolerate running out of most internal control blocks. Always provide a few extra of each where possible, and monitor storage utilization via the TSO/ISPF HCILOOK facility.
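
For rough planning, the per-block figures in the table can be multiplied by the number of each control block that you configure. The following Python sketch does that arithmetic. It assumes the table values are sizes in bytes per control block, and the function name and the sample counts are hypothetical. Treat the result as an estimate only and confirm actual usage with HCILOOK or the DISPLAY STOR HCI command.

```python
# Size and storage location per control block, copied from the table above.
# Sizes are assumed to be bytes per control block.
CONTROL_BLOCKS = {
    "AMB": (728, "CSA below"),
    "CCB": (1264, "Private above"),
    "DCB": (760, "Private above"),
    "JCB": (104, "Private above"),
    "PCB": (488, "Private above"),
    "RCB": (96, "CSA above"),
    "TPT": (320, "Private above"),
    "UIB": (1040, "CSA above"),
    "WRE": (256, "CSA above"),
}


def estimate_storage(counts):
    """Return estimated bytes per storage location for the given block counts."""
    totals = {}
    for name, quantity in counts.items():
        size, location = CONTROL_BLOCKS[name]
        totals[location] = totals.get(location, 0) + size * quantity
    return totals


# Hypothetical configuration: 100 WREs, 10 UIBs, and 10 CCBs.
print(estimate_storage({"WRE": 100, "UIB": 10, "CCB": 10}))
```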


 


BMC AMI Enterprise Common Components 17.02