Sampling statistics


Sampling large objects makes BMCSTATS run much faster than when gathering complete statistics. The smaller the sample, the faster BMCSTATS runs.

Minimum size requirements

If you specify TSSAMPLEPCT  percent or IXSAMPLEPCT  percent, BMCSTATS scans all space maps and control pages, and randomly samples the percentage of data pages unless the object has fewer than 1000 pages.

If the object has fewer than 1000 pages for sampling, BMCSTATS turns sampling off and reads all data pages. If you need greater precision for index statistics, specify a percentage higher than the default, as in the following example:

Example
TSSAMPLEPCT 25
IXSAMPLEPCT 50

BMCSTATS does not sample objects that contain fewer than the specified number of 4-KB pages. DASD MANAGER PLUS enforces a minimum of 1000 pages regardless of page size.

For multi-table table spaces, the numPages/numTables must also be greater than the minimum pages listed.

After scanning all space maps, BMCSTATS sampling performs the following tasks:

  1. It averages the statistics values for the space map pages.
  2. It multiplies the average by the total number of pages.

The two operations calculate statistics at the page level and row level, providing values such as CARD, FARIND, and NEARIND.

While sampling indexes reduces resource consumption, restrictions affect when you can specify this option. In the following instances, DASD MANAGER PLUS suppresses index sampling and processes the entire index:

  • Specifying an index sampling option in addition to UPDATEDB2 Y

    To eliminate this restriction, you can specify the UPDCATIXS=Y installation option during installation to allow the catalog to be updated with index sampled statistics.

    Warning

    Index sampling might produce statistics that cause SQL optimizer access selection problems. We recommend that you review the index sampled statistics for your indexes before deciding to use them for updating the catalog.

  • Specifying an index sampling option and an option to collect histogram statistics for indexes

    When collecting histogram statistics, BMCSTATS invokes DSNUTILB (which also does not support index sampling).

  • Using an index that has fewer than 1000 pages

Specifying sampling percentages

When you specify a percentage for a value in the panels, 25 percent is conservative and 10 to 15 percent might be adequate.

To process all pages, remove the TSSAMPLEPCT keyword.

Restricting sampling by object type

You can further restrict the sample percentage by object type.

When you specify the SAMPLE Y keyword, sample processing applies to whatever object type is currently being processed. When processing both table spaces and indexes, you can use SAMPLE T or SAMPLE I to restrict sampling by object type (T for TABLESPACE and I for INDEX) with 25 percent sampling. Entering a percent (from 1 through 50) on the panel for Sample Tablespaces or Sample Indexes on the BMCSTATS panel generates TSSAMPLEPCT and IXSAMPLEPCT, respectively.

Sampling table columns

Sampling is faster on large objects than gathering complete statistics.

Sampling on table columns estimates values by using probability. For greater accuracy, do not use sampling.

Sampling by using a seed value

BMCSTATS uses random pages to sample at the page level unless you use the REPEATABLE clause.

If you specify REPEATABLE with an integer, BMCSTATS uses the integer as the seed value for the sample. Consequently, the product samples the same pages, and you can perform the following operations:

  • Rerun BMCSTATS with the same REPEATABLE value
  • Produce statistics based on the same pages

If you do not specify REPEATABLE, BMCSTATS uses a random seed value for each run. Consequently, the product might produce different table space sampling statistics every time you run BMCSTATS.

 

Tip: For faster searching, add an asterisk to the end of your partial query. Example: cert*