General Considerations for Optimum Performance


Most IAM data sets will achieve outstanding performance, particularly with the recommended Global Options settings, and in general IAM files do not require routine monitoring. If performance is not meeting your expectations, or if you want to make sure that you are getting the best possible performance from all of your data sets, consider the following general guidelines.

  1. Make sure that you have a way to obtain IAMINFO reports. This can be done either by adding an IAMINFO DD card to the JCL for job steps using IAM files, or by collecting the IAM SMF records and then post-processing the data with the IAMINFO command of the IAMSMF utility program. Use of the IAMSMF program is recommended, particularly because it can generate a CSV format file of the IAMINFO report so the data can be analyzed in spreadsheet form. These reports contain critical information for detailed tuning, so become familiar with their contents. By activating and collecting the IAM SMF records, you can also utilize the IAMSMFVS reports for a more concise report format that makes it easy to find the data sets that might require additional attention.
  2. Periodically review the IAMINFO reports. If more buffers would help reduce physical I/O’s, the IAMINFO report will contain an IAM368 message indicating so. In fact, you can request that IAMSMF print only those IAMINFO reports where that message appears by specifying the keyword ATTRIBUTE=MOREBUFFER, as in the sample job after this list. If this message appears for several data sets, then perhaps the BUFSP Global Option should be increased to avoid the need for several overrides.
  3. For optimal I/O performance, use data compression. This will help reduce physical I/O’s, reduce virtual storage for the prime index, and reduce DASD space requirements. However, if CPU time is more of a concern, avoid data compression, particularly on data sets that have long records or that are processed sequentially.
  4. Make sure that heavily updated files are regularly reorganized. This will help prevent virtual storage problems, long open times, and high physical I/O activity.
  5. Avoid the specification of Share Options 3 or 4 for IAM data sets, unless you have activated IAM/PLEX or IAM/RLS. The optional IAM/PLEX feature provides support for sharing files for update across multiple LPARs in a SYSPLEX, and IAM/RLS does the same for applications running within the same LPAR. Specification of those share options without IAM/PLEX or IAM/RLS will force additional physical I/O’s that can be substantial and will not maintain full data integrity.
  6. Investigate increasing the block size for data sets with a large Prime Index structure, particularly if the data set has relatively large record sizes.
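
The following is a minimal sketch of a batch job that post-processes collected SMF data with the IAMSMF utility to produce only the IAMINFO reports flagged by the IAM368 message, per item 2 above. The DD names and the SMF input data set name are illustrative assumptions; the IAMINFO command and the ATTRIBUTE=MOREBUFFER keyword are as described above.

Example of an IAMSMF Job to Select IAMINFO Reports (sketch)

//IAMRPT   EXEC PGM=IAMSMF
//SYSPRINT DD  SYSOUT=*
//SMFIN    DD  DISP=SHR,DSN=YOUR.SMF.DUMPDATA      SMF INPUT (ASSUMED NAMES)
//SYSIN    DD  *
  IAMINFO ATTRIBUTE=MOREBUFFER
/*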

Buffering

IAM makes it easy to determine when more buffers could have reduced I/O by providing the IAM368 message in the IAMINFO report. Unless storage is a concern, there is no reason to worry about being overly aggressive in setting MAXBUFNO: IAM’s Real Time Tuning will carefully adjust the buffering for the data set as processing needs vary. For programs that do all sequential processing, the maximum number of buffers used for the file will be the number of blocks per cylinder plus a few additional buffers to handle Extended Overflow blocks. Usually, providing one or two tracks’ worth of buffers for overflow will be sufficient, unless a data set makes very extensive use of overflow. The default BUFSP Global Option setting maximizes the buffering for sequential processing, eliminating the need to increase buffers for most batch jobs.

For programs that do all random I/O, a mix of random and sequential I/O, or short strings of sequential I/O requests, increase the MAXBUFNO value by a quantity that you feel comfortable with. If you are not concerned about virtual storage usage or paging, use a large quantity; if storage is a concern, increase the value by 4 or 8 and see how that helps. The methods of increasing MAXBUFNO for any particular file include the following (see the sketch after this list):

  • Providing an IAM ACCESS MAXBUFNO or BUFSPACE override for the job step and data set.
  • Specify the BUFND parameter, either within the ACB or as part of the AMP parameter on the DD card for the data set, for example AMP=('BUFND=nnn'). For CICS files not in an LSR pool, the resource definition for the data buffers will result in changing the BUFND value in the ACB that is used by CICS.
  • To increase buffering for all jobs that use the data set, provide an IAM CREATE MAXBUFNO override when the data set is defined, loaded, or reorganized. The specified MAXBUFNO value will be applied whenever the data set is accessed.
  • Specify a value for BUFSPACE on the IDCAMS define control statement for the data set.
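
For example, here is a minimal sketch of the first and second methods above; the MAXBUFNO value of 32 and the data set and DD names are illustrative assumptions. Normally only one of the two methods would be used for a given file.

Example of Increasing Buffers for a Job Step (sketch)

//IAMOVRID DD  *
  ACCESS DD=&ALLDD,MAXBUFNO=32
/*
//MYFILE   DD  DISP=SHR,DSN=MY.IAM.KSD,AMP=('BUFND=32')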

The two circumstances where you might not want to increase buffers for the job are as follows:

  1. When the job is performing sequential processing against a data set that is concurrently open to online systems. This is because the batch job could end up dominating the file, resulting in poor response times for users of the data set on the online system. In fact, you will probably want to reduce MAXBUFNO for such jobs.
  2. When the job has a virtual storage constraint. Refer to IAM-Storage-Usage for guidance on adjusting buffers for jobs that have storage constraints.

Otherwise, it is perfectly fine to increase the MAXBUFNO or BUFSPACE value.

Extended Overflow

Excessively large Extended Overflow usage can result in a deterioration of performance that can usually be avoided by periodically reorganizing files once they are using a large quantity of extended overflow. VSAM data sets also require reorganizations due to performance deterioration and space usage, so many application job streams that were converted to IAM from VSAM already have regularly scheduled file reorganizations, which will generally be sufficient for IAM data sets. Depending on the data set and application activity, the reorganizations may be done daily, weekly, monthly, or even quarterly.

Some of the symptoms of an excessively large usage of extended overflow are:

  • Long elapsed time to open the data set.
  • Excessive use of virtual storage or the IAM Index Space.
  • High I/O rates when processing the data set sequentially.
  • Potential inability to open the data set or other data sets due to virtual storage constraints.

It can be difficult to predict the level of extended overflow usage at which serious performance deterioration will occur. For example, one file could have over a million records in overflow and not experience any noticeable performance difficulties, whereas another data set may have only a few hundred thousand records in overflow and be experiencing severe symptoms. The key factors are the key length and the general placement of records in the overflow area. For example, if a file has a key length of 4 with a million records in overflow, the storage used for that index is going to be substantially less than if the file had a key length of 64. If the records in overflow are in generally ascending key sequence, or in clusters of ascending key sequence, then the I/O impact and the processing time to open the data set will most likely not be seriously affected. A very random placement of records throughout overflow can have a serious impact on sequential I/O performance and on the processing time to open the data set.

If there is a need for frequent reorganizations, consider defining such files with the Prime Related Overflow (PRO) format. The PRO format has a different overflow structure that is indexed at the block level rather than the record level, generally resulting in a much smaller index and a reduced need to reorganize the file. The use of PRO is recommended for files that have around a million or more records in overflow.

One caution concerns reorganizations that are done by application programs. Some application reorganizations are done as a single-record load followed by a mass insert. This is not a valid reorganization from the access method’s point of view, and the resulting data set will frequently be in a less than optimum state afterwards. If such a technique is used, the application reorganization should be followed by a file reorganization done by FDRREORG or an IDCAMS REPRO, as in the sketch below.
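
As a hedged sketch of the REPRO approach, the following job unloads the file to a sequential data set and reloads it in key sequence; the data set names are assumptions, and the unload data set is presumed to already exist. Because the files shown later in this section are defined with REUSE, the reload REPRO specifies REUSE to replace the contents.

Example of an IDCAMS REPRO Reorganization (sketch)

//REORG    EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=*
//SYSIN    DD  *
  REPRO INDATASET(MY.IAM.KSD) OUTDATASET(MY.IAM.UNLOAD)
  REPRO INDATASET(MY.IAM.UNLOAD) OUTDATASET(MY.IAM.KSD) REUSE
/*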

The other thing to watch for with scheduled reorganizations is where they occur within the batch job stream. For example, some applications reorganize a data set after it is closed by the online systems, and then execute a large batch update process. The batch update process can result in heavy overflow use, so when the data set is subsequently opened for online processing it is in a less than optimum state. By simply scheduling the reorganization after the update processing, the file will be in the best possible organizational state when it is opened for online processing.

Guidelines for Reorganizing

Some guidelines for determining when an IAM data set should be reorganized include the following:

  • When more than 10 to 20 percent of the records in the data set are in extended overflow.
  • When the size of the Extended Overflow Index exceeds some storage quantity, such as 64 megabytes.
  • When the Overflow area exceeds a quantity of DASD space, such as 1,000 cylinders.
  • When a single volume data set is approaching sixteen extents.
  • When the number of overflow records for a particular data set approaches or exceeds a predetermined number of records. IAM can assist in monitoring this if the file is defined with an Overflow (O=) override of that specified value, as in the sketch after this list.
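
For example, here is a minimal sketch of such an override supplied when the file is defined; the threshold of 500000 records is an illustrative assumption.

Example of an Overflow Override (sketch)

//IAMOVRID DD  *
  CREATE DD=&ALLDD,O=500000
/*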

For some of the above guidelines, IAM will issue informational messages. These messages have been changed in Version 9.2, and are now IAMW21, IAMW22, IAMW91 and IAMW92, indicating that reorganization is recommended along with the reason. The messages also use the z/OS multi-line format to make it easier to identify the data set name associated with the message, so users who schedule reorganizations based on the IAM messages can be more selective about which files trigger the reorgs. The IAMINFO reports will also include an IAM373 message indicating that reorganization is recommended. Several of these factors are available as selection criteria on FDRREORG, which provides an automatic method for reorganizing files only when needed. Other methods of automating file reorganizations include using the reports generated from the IAM SMF records by IAMSMF or IAMSMFVS. Full information on IAM data set reorganizations is provided in Reorganizing-IAM-Data-Sets of the IAM space.

Large Prime Index

Data sets that have a prime index structure exceeding a few megabytes are considered to have a large prime index. The amount of storage required for the Prime Index, and whether or not it is compressed, is provided in both the IAMINFO and IAMPRINT reports. Having a large prime index structure will not necessarily cause a performance problem; however, such files may realize improved performance from reducing the prime index size. There are a number of factors to consider. The potential advantages of reducing the prime index size are faster index search time and reduced virtual storage requirements. The reduction in virtual storage may be partially, or in some cases entirely, offset by an increase in buffer size if the block size is increased. The costs are increased search time for records within each data block and increased physical I/O time.

The prime index size is based on the number of prime blocks, the key length, and the compressibility of the key structure. From a tuning perspective, there are two possibilities for reducing the index size: improving the index compression and/or reducing the number of prime data blocks.

Controlling Index Compression

The parameters that influence the key compression for the index are in the IAM Global Options Table, and are also available as IAM Overrides. Basically, for any particular data set, IAM will take a fixed number of the high keys in each block and compress them by eliminating each relative key position in which all keys in the set have an identical value. For example, if the first byte of each key is A, then A is eliminated for that set; if the second byte of each key is B, then the second byte is eliminated from each key in the set. This process continues for each byte of the key, leaving only those byte positions where the keys in the set do not all have the same value. When the key length is less than 128 bytes, the number of keys in each set is the lower of the number of keys that will fit in 256 bytes or the value of LIMITKEYS. The default value for LIMITKEYS is 32.

For data sets with fairly long keys, there is the Long Key Compression option. Long Key Compression eliminates the limitation that a set of keys must fit in 256 bytes, and may provide improved index compression for files with long keys. The default minimum key length for Long Key Compression is 33 bytes. Regardless of the key size, if it is compressed with Long Key Compression then the number of keys in each compressed set is completely controlled by the LIMITKEYS value.

When using Long Key Compression, the value for LIMITKEYS can be adjusted on a file-by-file basis using the LIMITKEYS override. Some experimentation may be necessary to see whether a lower or higher value improves the amount of index compression that can be achieved.

You can also change the Global Option value for the minimum key length at which Long Key Compression is used, for example reducing it to 9 bytes.
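
For example, here is a minimal sketch of adjusting LIMITKEYS for the files defined in a job step; the value of 16 is an illustrative assumption that may require experimentation.

Example of a LIMITKEYS Override (sketch)

//IAMOVRID DD  *
  CREATE DD=&ALLDD,LIMITKEYS=16
/*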

Reducing the Number of Prime Blocks

You may have some control over the number of prime blocks. The first thing to verify is that data compression is enabled for the data set; this can reduce the number of prime blocks by fitting more data within each prime data block. The next factor to check is a large CI free space area. Make sure that a large CI free space area is warranted based on insert or record growth activity to avoid overflow growth; for most files, providing CI free space results in wasted space, and reducing a large CI free space will result in fewer prime blocks. Next, if the file is at less than half-track blocking, increasing the block size will reduce the prime index storage. Changing the block size requires some caution unless the data set is quite predominately processed sequentially. Random processing or the short sequential browses that are typical of online systems may incur increased response times when using a larger block size, due to the increase in data transfer time, and are also subject to increased CPU time to search the data block for the required record.

When to Increase the Block Size

So, when is it beneficial to increase the block size? There are two factors to consider: the first is the average record size as the data is stored, and the second is the benefit of buffering. As record sizes increase, there will be more benefit to increasing the block size, providing that buffering is reducing physical I/O. The average stored record size is provided on the IAMINFO report for the file load. If that is not readily available, then an approximate value can be calculated with data from an IAMINFO or IAMPRINT report as follows:

Calculation for Approximate Average Record Length

                      Block Size * ((100 - CI Free Space %) / 100)
  -----------------------------------------------------------------------------------
  (Total Records - Inserted Records + Deleted Records) / (Number of IAM Data Blocks - 2)
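
For example, using assumed values of a 16,384-byte block size, 10 percent CI free space, 1,000,000 total records, 50,000 inserted records, 10,000 deleted records, and 12,002 IAM data blocks:

      16,384 * ((100 - 10) / 100)                  =  14,746 usable bytes per block
      (1,000,000 - 50,000 + 10,000) / (12,002 - 2) =  80 records per block
      14,746 / 80                                  ≈  184 bytes average record length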

The benefit of buffering can be easily determined from data in the IAMINFO report by dividing the Disk Blocks Read by the Requests Processed, presuming that an adequate number of buffers is being provided. As this percentage of requests requiring I/O gets smaller, the benefit of buffering increases, and the more likely it is that increasing the block size will help. There may not be much benefit from the physical I/O perspective if more than 50% of the logical requests require I/O; the benefit is likely to be larger as the percentage drops to 25%, 10%, or even lower.
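
As a hypothetical illustration, an IAMINFO report showing 5,000 Disk Blocks Read against 50,000 Requests Processed yields a ratio of 10%: nine out of ten requests are being satisfied from the buffers, so increasing the block size is likely to provide a worthwhile reduction in physical I/O.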

How to Increase the Block Size

As a general rule, if the average record size is 1K (1,024 bytes) or more, and there is some beneficial buffering, there should be no hesitancy about increasing the block size. The block size, or blocking factor, can be changed either by using the CREATE IAM Override B= during the file define, load, or reorganization to specify a blocking factor, or by increasing the CI size on the DEFINE statements. For example, specifying a B=2 override will force half-track blocking. A blocking factor of 1 is not recommended because a considerable amount of DASD space will be wasted due to the 32K limitation on the IAM data set block size.

Example of IAM Override to set 1/2 Track Blocking

//IAMOVRID DD *
CREATE DD=&ALLDD,B=2
/*

For data sets with smaller average record sizes, increasing the block size can be considered and will be beneficial with larger prime index structures, as long as there has been beneficial buffering. There is probably not much benefit to increasing the block size for files with average record sizes of less than 500 bytes, unless the I/O activity is predominately sequential.

High I/O Rates

This section discusses some of the common causes of higher than expected physical I/O’s, commonly referred to as the EXCP count. The IAMINFO report is a necessity for understanding such a problem. The key statistical fields from the IAMINFO report include the following:

  • DISK BLOCKS READ: The number of physical I/O’s (EXCP’s) that were issued to read data from the IAM data set.
  • DISK BLOCKS WRITTEN: The number of physical I/O’s (EXCP’s) that were issued to write data to the IAM data set.
  • SEQ CHAINED BLOCKS READ: The number of additional data blocks read in as part of a sequential I/O. This number plus the DISK BLOCKS READ is the total number of blocks read into storage.
  • SEQ CHAINED BLOCKS WRITTEN: The number of additional data blocks written out as part of a sequential I/O. This number plus the DISK BLOCKS WRITTEN is the total number of blocks written out to DASD.

The total EXCP count for the IAM data set can be easily calculated by adding the DISK BLOCKS READ and DISK BLOCKS WRITTEN values. It is quite useful to have the two separate values, as they help identify what is going on with the data set. Some of the circumstances and potential actions are described below.

If the value for Disk Blocks Written is very high, the most likely cause is that IAM is not deferring the writes for random updates. This situation occurs when the data set is defined with Share Option 3, or when a Share Option 1 or 2 data set is processed asynchronously, as is done by CICS. For online systems this is generally the desired behavior, so no change is recommended. Data sets defined with Share Option 3 can be redefined with Share Option 2, because of the very high risk associated with sharing an IAM data set for update.
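
One way to change the share options without reloading the file is the IDCAMS ALTER SHAREOPTIONS parameter, sketched below. The data set name is an assumption, and whether ALTER should be directed at the cluster name or a component name can vary, so verify the exact form against the IAM documentation.

Example of Altering the Share Options (sketch)

//ALTSHR   EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=*
//SYSIN    DD  *
  ALTER MY.IAM.KSD SHAREOPTIONS(2 3)
/*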

If both the Disk Blocks Written and Disk Blocks Read are very high, such that they equal or exceed the total requests, the most likely cause is that the file is defined with Share Option 4 without IAM/PLEX or IAM/RLS active for the data set. Share Option 4 forces IAM to use only one buffer, to always reread a data block whenever it is requested, even if it is already in the buffer, and to immediately write out any updated data block, including sequentially updated data blocks. The data set should be redefined with a Share Option of 2, because sharing an IAM data set for update is most likely going to result in a corrupted data set and lost data.

If both Disk Blocks Read and Seq Chained Blocks Read are exceedingly high, the problem is most likely that IAM is rereading empty prime or PE blocks. This can result from an application having mass deleted a large group of records that occupied contiguous blocks, followed by attempts to retrieve records using a key greater-than-or-equal type of search. Depending on the Share Options and how the data set was opened, IAM is able to avoid this type of processing. The affected data set should be reorganized to resolve the problem; alternatively, the REREADEMPTY=NO IAM ACCESS override, shown below, may prevent the high I/O rate.
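
A minimal sketch of that override, following the override format used elsewhere in this section:

Example of the REREADEMPTY Override (sketch)

//IAMOVRID DD  *
  ACCESS DD=&ALLDD,REREADEMPTY=NO
/*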

If Disk Blocks Read is quite high for a basic sequential I/O type of job, the most likely cause is a large number of records in key sequence that are scattered through many different Extended Overflow blocks. Such a situation is also likely to be accompanied by a long time to OPEN the data set, due to the Extended Overflow index build process. The solution to this problem is to reorganize the data set.

Using Multiple Volumes for Performance

For data sets that have unusually high I/O activity and are not on a device with PAV (Parallel Access Volume), it may be quite beneficial to spread the data set across multiple volumes. By doing so with Enhanced Format IAM data sets, concurrent physical I/O can be scheduled to each DASD volume, which may result in significantly improved online response times. With a little planning, this is easy to accomplish by setting up proper space allocation parameters. Two different techniques are shown below. For both examples, it has been determined that the data set requires approximately 2,000 cylinders of space, excluding overflow requirements. The bulk of the data set will be split across 4 DASD volumes; a fifth volume will be used to handle any potential growth into the IAM Extended areas.

The first example can be used for installations that have DFSMS active on their system.

Note

The data set does not have to be SMS-managed for this technique to work; DFSMS just has to be active on the system.

If the data set is going to be on SMS-managed volumes, then the data set must be defined with Guaranteed Space. If the data set is being allocated to non-SMS-managed volumes, IAM allocates the data set as if it were defined with Guaranteed Space under DFSMS. This means that IAM will allocate the primary space quantity on each volume when the data set is defined. For this technique to work, the secondary space quantity must be 0, which prevents the use of secondary extents. File expansion is accommodated by utilizing the space on the fifth volume.

Example of Spreading IAM Data Set across Multiple Volumes

//DEFMULTV EXEC PGM=IDCAMS
//SYSPRINT DD   SYSOUT=*
//SYSIN     DD   *
      DEFINE CLUSTER    -
   (NAME(MY.IAM.KSD)  -
  OWNER($IAM)  -
   VOLUMES(MYVOL1 MYVOL2 MYVOL3 MYVOL4 MYVOL5) -
   CYL(500)   RECORDSIZE(100 1000) -
   KEYS(24 8) FREESPACE(5 20)  -
   SHAREOPTIONS(2 3) REUSE )
    LISTCAT ENT(MY.IAM.KSD) ALL
/*

In the next example, a different technique is used where the data set is allowed to take secondary extents. This is effective for files that are not SMS Extended Format files. To achieve the desired 500 cylinders on each of the 4 volumes, a primary of 200 cylinders is requested with a secondary of 20 cylinders; the secondary extents supply the remaining 300 cylinders per volume, being 15 extents times 20 cylinders. The IAM override MAXSECONDARY=1 is specified to prevent IAM from increasing the secondary allocation, and an override of MULTIVOLUME=PRIMARY is specified to cause IAM to allocate the primary quantity for the first extent on each subsequent volume.

Example of Spreading IAM Data Set Across Volumes

//DEFMULTV EXEC PGM=IDCAMS
//SYSPRINT DD   SYSOUT=*
//IAMOVRID DD   *
  CREATE DD=&ALLDD,MAXSECONDARY=1,MULTIVOLUME=PRIMARY
/*
//SYSIN     DD   *
       DEFINE CLUSTER    -
    (NAME(MY.IAM.KSD)  -
   OWNER($IAM)  -
    VOLUMES(MYVOL1 MYVOL2 MYVOL3 MYVOL4 MYVOL5) -
    CYL(200 20) RECORDSIZE(100 1000) -
    KEYS(24 8) FREESPACE(5 20)  -
    SHAREOPTIONS(2 3) REUSE )
     LISTCAT ENT(MY.IAM.KSD) ALL
/*

 
