IAM Data Compression


Advantages

Data compression reduces DASD space usage and the amount of data transferred over the I/O subsystem, which in turn reduces both the number of physical I/Os and the physical I/O time. IAM can compress the data in KSDS, ESDS, RRDS, and AIX files. IAM compression is completely transparent to the programs that create and use those files. Customers can choose to use either IAM software compression, or they can request that IAM use the z/Architecture hardware compression instruction. With hardware compression, you can either have IAM dynamically build the compression dictionary during the file load, or build a customized dictionary before loading the data set. The primary advantage of building a customized dictionary is that it provides the greatest space savings from data compression.


For software compression, IAM uses a proprietary algorithm that is optimized for minimal CPU processing requirements while still providing good space savings for most files. IAM's software data compression does not rely on a compression dictionary, which reduces the exposure to potential data loss. While it is possible for alternative data compression algorithms to achieve a greater reduction in record size, the CPU time needed to achieve such results can be excessive. In fact, for many files, IAM's software compression CPU time is still less than VSAM's CPU time without data compression. Also, because VSAM uses generic compression tables, IAM is, for many files, still able to achieve similar or better space savings than VSAM hardware compression.

IAM’s CPU time with hardware compression will be less than VSAM's, and with customized compression dictionaries, IAM may achieve greater compression. Unlike the IAM software compression, the IBM hardware compression requires the use of a compression / decompression dictionary. IAM will generate a dictionary using up to eight megabytes of data from the first records initially loaded into the data set. The intent is to provide a decent level of compression while minimizing the CPU time used to generate the dictionary. Users can build their own compression dictionaries using a more extensive process that will likely yield much greater compression for most data sets. Instructions on creating customized compression dictionaries are in Using-hardware-compression of the IAM space.

The default compression technique, either hardware or software, can be set in the IAM Global Options Table with the COMPRESSTYPE option. As shipped, the default is software compression. Data compression increases the CPU time needed to process a file compared with processing it without compression. When choosing between the two techniques, consider that hardware compression uses more CPU time than software compression during a file load or reorganization, but slightly less CPU time during normal data set access. For files that are very infrequently loaded or reorganized, hardware compression will therefore generally use less CPU time.

Eligibility for data compression

IAM considers KSDS and AIX files that are 75 tracks or larger, and all ESDS and RRDS files, as candidates for data compression. The file size that determines eligibility for IAM Data Compression can easily be changed by setting the IAM DATACOMPRESS Global Option. The automatic setting of IAM's Data Compression can always be altered through the IAM Override facility.

To be eligible for IAM Data Compression, a file must be defined with a maximum record size that is at least 10 bytes more than its key length plus the relative key position (RKP). If Data Compression is enabled for a file, IAM will only compress individual records when the data following the record key exceeds ten (10) bytes in length. If compression would make a record larger than the original, IAM leaves the record uncompressed. Subsequent updates to an uncompressed record will keep the record uncompressed.
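The file-level eligibility rules above can be sketched as a small check. This is only an illustration of the arithmetic, not IAM's implementation: the names FileDefinition and is_compression_candidate are hypothetical, the 75-track threshold is the value stated above (the DATACOMPRESS Global Option may change it), treating ESDS and RRDS files as having a key length and RKP of zero is an assumption, and the sketch ignores anything set through the IAM Override facility.

```python
from dataclasses import dataclass

# Values taken from the text above; a site may change the track threshold
# through the DATACOMPRESS Global Option.
DEFAULT_MIN_TRACKS = 75
MIN_DATA_BYTES = 10


@dataclass
class FileDefinition:
    """Hypothetical description of a file, for illustration only."""
    file_type: str          # "KSDS", "ESDS", "RRDS", or "AIX"
    tracks: int             # allocated size in tracks
    max_record_size: int    # defined maximum record size
    key_length: int = 0     # assumed 0 for non-keyed file types
    rkp: int = 0            # relative key position, assumed 0 when unused


def is_compression_candidate(f: FileDefinition,
                             min_tracks: int = DEFAULT_MIN_TRACKS) -> bool:
    """Apply the size and record-length rules described in the text."""
    # KSDS and AIX files must reach the track threshold;
    # ESDS and RRDS files are candidates at any size.
    if f.file_type in ("KSDS", "AIX") and f.tracks < min_tracks:
        return False
    # The maximum record size must be at least 10 bytes more than
    # the key length plus the relative key position.
    return f.max_record_size >= f.key_length + f.rkp + MIN_DATA_BYTES


if __name__ == "__main__":
    big_ksds = FileDefinition("KSDS", tracks=100, max_record_size=100, key_length=8)
    small_ksds = FileDefinition("KSDS", tracks=50, max_record_size=100, key_length=8)
    print(is_compression_candidate(big_ksds))
    print(is_compression_candidate(small_ksds))
```

Note that this covers only whether a file is a candidate. Whether an individual record is actually compressed still depends on the per-record rules described above.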

Data Compression can be used on all your IAM files. If a particular file is found by IAM to be uncompressible, there is no penalty in CPU time to process that file after the load. It is as if compression had never been requested for that file. There may also be a few files that just do not show much of a benefit from data compression. For example, SMP/E CSI files have an average record length that is just a bit larger than their key. When there is not much data to work with, there is little data compression can do to reduce a file's size. In these cases IAM's Data Compression may show little saving, beyond the space reduction that comes with simply converting to IAM. If a specific file shows only marginal compression there will likewise be only a marginal increase in IAM's CPU time to process that file.

Backup compressed data

IAM offers the capability to back up and reload software compressed data within an IAM file without decompressing or recompressing the data. For large files, this is anticipated to allow IAM files to be backed up and reorganized faster than can be done today. Even when the data is compressed by the tape control unit, there is still the overhead of transferring all that data to the controller. With this new feature, both the CPU overhead and the I/O overhead are eliminated. The FDRREORG product from BMC will automatically use this IAM feature.

For other programs, such as IDCAMS, the backup and reload of software compressed data is requested through the IAM Override facility. The override must be specified for both the backup and the reload, because IAM needs to know not to decompress the data on the backup side, and that the data is already compressed on the reload side. Simply specifying the keyword BACKUPCOMPRESSED on the ACCESS and CREATE IAM overrides does the job. IAM adds four bytes to each record when performing this function, so any output file created must have either a variable (RECFM=VB) or undefined (RECFM=U) record format. For variable output files, the record length of the output file (LRECL) must be at least 8 bytes more than the defined maximum record length for the file. For example, if the maximum record length for the file is 100, then the output LRECL must be at least 108. For undefined records, the maximum LRECL in the same example is 104, only 4 bytes more than the file's maximum record size. We recommend using RECFM=VB output to provide the best output device utilization.
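The record-length arithmetic for a BACKUPCOMPRESSED output file can be written out as a short calculation. This is a sketch of the numbers given above only: the function name min_output_lrecl is hypothetical, and attributing the extra 4 bytes for RECFM=VB to the standard record descriptor word is an assumption, since the text states only the totals.

```python
IAM_ADDED_BYTES = 4   # bytes IAM adds to each record, per the text above
VB_EXTRA_BYTES = 4    # assumed to be the record descriptor word for RECFM=VB


def min_output_lrecl(max_record_size: int, recfm: str) -> int:
    """Return the smallest output LRECL for the given record format."""
    recfm = recfm.upper()
    if recfm == "VB":
        # Variable records: 8 bytes more than the defined maximum record length.
        return max_record_size + IAM_ADDED_BYTES + VB_EXTRA_BYTES
    if recfm == "U":
        # Undefined records: 4 bytes more than the defined maximum record length.
        return max_record_size + IAM_ADDED_BYTES
    raise ValueError("output must be RECFM=VB or RECFM=U")


if __name__ == "__main__":
    # The example from the text: a file with a maximum record length of 100.
    print(min_output_lrecl(100, "VB"))
    print(min_output_lrecl(100, "U"))
```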

Data in an IAM data compressed format on tape can be easily converted to an uncompressed format. Either reload the data with BACKUPCOMPRESSED into an IAM file or use the IAMRECVR DECOMPRESS command to make a sequential copy of the data set with uncompressed data.

Important

You will require the original key length and key offset (RKP) to perform the DECOMPRESS function. IAM also provides a callable interface to read and perform the decompression from a data compressed sequential data set that can be used by application programs.

For examples of using the BACKUPCOMPRESSED feature and using FDRREORG to reorganize IAM data sets, see Reorganizing-IAM-Data-Sets.

 
