Using the ANALYZE option in REORG PLUS to estimate data set allocation
If you specify ANALYZE PAUSE or ANALYZE ONLY, REORG PLUS gathers information about the objects that you are reorganizing.
In addition to cardinality and average row size, the ANALYZE phase provides estimated data set sizes for the following data sets:
- Unload (SYSREC)
- Work (SYSUT1)
- Sort (SORTWK)
- Full image copy (BCPY, BCPZ, BRCY, and BRCZ)
- Incremental image copy (BICY, BICZ, BIRY, and BIRZ)
Estimates provided by the ANALYZE option
The following table details the space estimates provided for both table space and index reorganizations and provides estimates for both single and multiple SYSUT1 and SYSREC data sets. Refer to the specification guidelines for each data set provided in the preceding pages to determine whether to specify single or multiple SYSUT1 and SYSREC data sets.
For both optimum and minimum sort work file estimates, REORG PLUS uses the largest index to determine estimates. The optimum value is either the space required to sort the largest task (the task unloading the most data) or the space required to sort the largest index, whichever is greater. The space required for the task that unloads the most data can always be determined by sampling.
Data sets for which estimates are provided | Table space reorganization | Index reorganization | Information provided |
---|---|---|---|
Single SYSREC data set | Yes | Not applicable | Provides an estimate for all table space reorganizations except for a single-phase reorganization of a partitioned table space |
Multiple SYSREC data sets | Yes | Not applicable | Provides an estimate for each partition that you are reorganizing in a partitioned table space only |
Single SYSUT1 data set | Yes | Yes | For a table space reorganization, provides an estimate for all non-data-sorting indexes and includes any indexes being created. When you specify ORDER NO, the estimate includes the clustering index. For an index reorganization, the estimate is for the index that you are reorganizing. |
Multiple SYSUT1 data sets | Yes | Not applicable | Provides an estimate for each non-data-sorting index, including a non-data-sorting index being created. If you specified ORDER NO, ANALYZE provides an additional value for the clustering index, including a clustering index being created. |
SORTWK data sets | Yes | Yes | Provides two estimates, an optimum value and a minimum value. Each estimate is the total for all SORTWK data sets. Divide this value by the number of SORTWK data sets to get the individual data set sizes. ANALYZE provides the estimates only when a sort will be performed. |
Single full or incremental image copy data set (BCPY, BRCY, BICY, and so on) | Yes | Not applicable | Provides an estimate for single copy data sets when you are performing: |
Multiple full or incremental image copy data sets (BCPYnn, BRCYnn, BICYnn, and so on) | Yes | Not applicable | Provides an estimate for each partition that you are reorganizing in a partitioned table space only |
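The SORTWK arithmetic described above (take the greater of the two sort requirements for the optimum value, then divide a total by the number of SORTWK data sets) can be sketched in Python. This is only an illustration: the function names and the kilobyte figures are hypothetical and are not REORG PLUS output, and rounding up is an assumption made so that the data sets together still cover the total.

```python
import math

def optimum_sort_kb(largest_task_kb: int, largest_index_kb: int) -> int:
    """Optimum estimate: the greater of the largest task and the largest index."""
    return max(largest_task_kb, largest_index_kb)

def per_sortwk_kb(total_kb: int, sortwk_count: int) -> int:
    """Split a total SORTWK estimate (KB) evenly across data sets, rounding up."""
    if sortwk_count < 1:
        raise ValueError("at least one SORTWK data set is required")
    return math.ceil(total_kb / sortwk_count)

# Hypothetical figures, as if read from the ANALYZE estimates.
optimum_total_kb = optimum_sort_kb(largest_task_kb=900_000, largest_index_kb=1_200_000)
minimum_total_kb = 450_000

optimum_each_kb = per_sortwk_kb(optimum_total_kb, sortwk_count=4)
minimum_each_kb = per_sortwk_kb(minimum_total_kb, sortwk_count=4)
```

Allocating each SORTWK data set at the per-data-set optimum value, where space allows, follows the intent of the optimum estimate; the minimum value gives the lower bound.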
REORG PLUS provides the estimated information in table format. Messages BMC51260I and multiple BMC51263I messages provide the estimates. A separate BMC51263I message for each data set provides the following information:
- Data set name
- Number of kilobytes
- Primary and secondary 3380 cylinder quantities
- Primary and secondary 3390 cylinder quantities
- Index name, where applicable
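To see how a kilobyte figure relates to the cylinder quantities, the following Python sketch converts kilobytes to cylinders by using the raw track capacities of the two device types (47,476 bytes per track for 3380, 56,664 bytes per track for 3390, 15 tracks per cylinder on both). It is a rough cross-check only. It assumes that one kilobyte is 1,024 bytes and ignores block size, so the usable capacity is lower in practice, and it does not claim to reproduce the calculation that REORG PLUS performs. The function name is hypothetical.

```python
import math

# Raw track capacity in bytes for each device type.
TRACK_BYTES = {"3380": 47_476, "3390": 56_664}
TRACKS_PER_CYLINDER = 15  # same for both device types

def kb_to_cylinders(kilobytes: int, device: str) -> int:
    """Convert a size in KB to whole cylinders, rounding up.

    Assumes 1 KB = 1,024 bytes and full-track utilization, so the
    result is a lower bound on the cylinders actually needed.
    """
    cylinder_bytes = TRACK_BYTES[device] * TRACKS_PER_CYLINDER
    return math.ceil(kilobytes * 1024 / cylinder_bytes)

# Hypothetical estimate of 1,000,000 KB for one data set.
cylinders_3380 = kb_to_cylinders(1_000_000, "3380")
cylinders_3390 = kb_to_cylinders(1_000_000, "3390")
```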
Considerations
The following considerations apply to the estimates that ANALYZE provides:
- REORG PLUS cannot take into account rows bypassed with SELECT or DELETE.
- For several reasons, including rows that contain VARCHAR columns and tables that contain EDITPROCs, ANALYZE might report a secondary quantity for SYSREC that is too large. The reason is that the primary quantity is based on the average row length, and the secondary quantity is based on the maximum row length from the DB2 catalog. In this instance, BMC recommends that you provide a secondary quantity of approximately 25 percent of the primary quantity.
- REORG PLUS writes these statistics to SYSPRINT. For information about the other statistical information messages that the ANALYZE phase issues, see ANALYZE-messages.
- If you specify ANALYZE ONLY and use the information to allocate your data sets, you can improve performance by changing the REORG command options to ANALYZE HURBA when you rerun the job. Specifying ANALYZE HURBA bypasses the ANALYZE phase. For the list of restrictions when using HURBA, see HURBA.
- As an alternative to using ANALYZE PAUSE or ONLY to estimate sizes for data set allocation, you can have REORG PLUS dynamically allocate your data sets for you. To use dynamic allocation, specify ANALYZE (without PAUSE or ONLY). You must also have dynamic data set allocation active, either in your installation options or with the DDTYPE command option.
- If you do not use the PAUSE or ONLY keywords with ANALYZE, REORG PLUS also gathers the information described in this section. However, instead of pausing or stopping, REORG PLUS continues processing. If dynamic allocation is enabled, REORG PLUS uses the ANALYZE phase information to dynamically allocate your data sets. In this case, the ANALYZE phase does not write the statistics to SYSPRINT.
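The recommendation above to use a SYSREC secondary quantity of approximately 25 percent of the primary quantity can be expressed as a short Python sketch. The function name, the rounding up, and the floor of one cylinder are assumptions chosen for illustration; they are not part of the product.

```python
import math

def recommended_secondary(primary_cylinders: int, fraction: float = 0.25) -> int:
    """Return a secondary quantity of about 25 percent of the primary quantity.

    Rounds up and never returns less than 1 cylinder (both are assumptions).
    """
    if primary_cylinders < 1:
        raise ValueError("primary quantity must be at least 1 cylinder")
    return max(1, math.ceil(primary_cylinders * fraction))

# Hypothetical primary quantity taken from a SYSREC estimate.
secondary = recommended_secondary(100)
```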
Using ANALYZE with compressed table spaces
REORG PLUS uses the compressed row length to determine the size of the SYSREC and SORTWK data sets whenever possible.
SYSREC data set
REORG PLUS estimates the size of the SYSREC data set in the following manner:
- For compressed table spaces, REORG PLUS uses the average compressed row length.
- For noncompressed table spaces, REORG PLUS uses the average row length.
- For a multi-table table space, REORG PLUS averages the row length for the various tables.
The following table describes whether REORG PLUS uses compressed or expanded rows when KEEPDICTIONARY is in effect.
Type of reorganization | KEEPDICTIONARY value | Row length used |
---|---|---|
Single phase | YES | Compressed |
Single phase | NO | Expanded |
Two phase | YES | Compressed |
Two phase | NO | Compressed |
SORTWK data set
When estimating the size of the SORTWK data sets, REORG PLUS uses the average compressed row length only if all of the following criteria are true for a table or for all partitions of a table space:
- The value of the KEEPDICTIONARY command or installation option is YES (or is implied, as when you do a single-phase SHRLEVEL REFERENCE or SHRLEVEL CHANGE reorganization with ORDER NO).
- You did not add new columns to the table.
- You did not specify AMEND YES for the EDITPROC for this table.
- You did not specify UPDATE on the REORG command for the table.
- The table belongs to a table space with the COMPRESS YES attribute, or all of the partitions of the table space have the COMPRESS YES attribute.
For a partitioned table space, if only some of the partitions meet the preceding criteria, REORG PLUS uses the expanded row length to calculate the SORTWK data set size for all of the partitions.
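The all-or-nothing rule for partitioned table spaces can be sketched in Python as follows. The `PartitionState` class and its field names are hypothetical stand-ins for the five criteria listed above; the sketch only models the decision described in this section, not how REORG PLUS implements it.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class PartitionState:
    keepdictionary_yes: bool   # KEEPDICTIONARY is YES, explicitly or implied
    columns_added: bool        # new columns were added to the table
    editproc_amend_yes: bool   # AMEND YES was specified for the EDITPROC
    update_specified: bool     # UPDATE was specified on the REORG command
    compress_yes: bool         # the partition has the COMPRESS YES attribute

def meets_criteria(p: PartitionState) -> bool:
    """True only when all five criteria hold for this partition."""
    return (p.keepdictionary_yes and not p.columns_added
            and not p.editproc_amend_yes and not p.update_specified
            and p.compress_yes)

def sortwk_row_length_kind(partitions: Sequence[PartitionState]) -> str:
    """Compressed only if every partition qualifies; otherwise expanded for all."""
    if not partitions:
        raise ValueError("at least one partition is required")
    return "compressed" if all(meets_criteria(p) for p in partitions) else "expanded"

ok = PartitionState(True, False, False, False, True)
not_compressed = PartitionState(True, False, False, False, False)
kind_all_ok = sortwk_row_length_kind([ok, ok])
kind_mixed = sortwk_row_length_kind([ok, not_compressed])
```

A single partition without COMPRESS YES is enough to switch the estimate for every partition to the expanded row length.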
For a multi-table table space, REORG PLUS uses the following row lengths:
- Compressed row length for each table that meets all of the preceding criteria
- Expanded row length for each table that does not meet the criteria
REORG PLUS then averages the row lengths to achieve the estimated data set size.
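The per-table selection and averaging for a multi-table table space can be sketched in Python. This section does not say whether the average is weighted by row count, so the sketch assumes a simple unweighted mean; the function name and the sample row lengths are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple

# Each entry: (meets_all_criteria, compressed_row_length, expanded_row_length)
TableInfo = Tuple[bool, int, int]

def multi_table_row_length(tables: Iterable[TableInfo]) -> float:
    """Pick compressed or expanded length per table, then average the results.

    Assumes an unweighted mean; the source does not specify the weighting.
    """
    chosen = [compressed if qualifies else expanded
              for qualifies, compressed, expanded in tables]
    if not chosen:
        raise ValueError("at least one table is required")
    return mean(chosen)

# Hypothetical table space with one qualifying and one non-qualifying table.
estimate = multi_table_row_length([(True, 60, 100), (False, 80, 120)])
```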