Buffer pools
DB2 includes a built-in caching mechanism for data pages that are read from DASD to satisfy SQL statements from applications.
This cache is referred to as buffer pools. Database administrators typically control the size of the pools and decide which data page objects reside in which pool. Early releases of DB2 supported three buffer pools for 4-KB data pages and one buffer pool for 32-KB data pages.
Problem
As their customers moved more heavily into relational processing, IBM addressed mounting performance demands with a variety of I/O avoidance techniques, such as the various types of prefetch. It became obvious that radically different types of data that were accessed in different manners did not coexist well in the same buffer pool, so IBM increased the number of 4-KB pools to 50 and the number of 32-KB pools to 10 and added 10 8-KB and 10 16-KB pools. Beginning with DB2 v8.1, 64 bit memory architecture has allowed the buffer pool sizes to increase dramatically. Although this vast increase in caching resources has been available for years, most customers still use relatively few buffer pools.
DB2 system programmers and DBAs have found the task of buffer pool tuning complicated and difficult to understand and, with the advent of MVS sysplex configurations and DB2 data-sharing, the task has become even more complex. Shared data has to be accessed via 'group' pools so that changes can be coordinated across MVS images. How data is assigned to these buffer pools can make a big difference in the performance of the data-sharing complex.
Another important factor is the trend among large companies towards large third-party vendor application suites such as PeopleSoft and SAP. A defining characteristic of these packages is that they use large numbers of data tables (from thousands to tens of thousands). The job of configuring buffer pools for optimum performance and use of storage resources when confronted with such massive numbers of data objects is daunting.
Faced with these circumstances, every DB2 DBA asks the same questions:
- How many buffer pools should I have?
- How big should they be, and how should I set their thresholds and parameters?
- How big should they be, and how should I set their thresholds and parameters?
- How big should they be, and how should I set their thresholds and parameters?
- Which objects should be grouped together, and into which pools?
- How should the buffer pool thresholds be changed through the day?
- How should the page set buffer pool assignments change over time?
How these questions are answered can have a dramatic impact on the performance of corporate database applications.
The problem with placing all DB2 data objects into one big buffer pool (which is not uncommon) is that these different data objects have varied and distinct access characteristics. These different characteristics tend to interfere with each other, often resulting in poor performance and inefficient resource use. Different data objects (with widely varying access characteristics) in the same pool can suffer complex interactions, resulting in some types being forced out of the pool excessively and driving down overall buffer pool effectiveness. The generally accepted way to deal with this problem is to classify the data objects and combine those with the same characteristics into their own pools.
The following figure illustrates a typical buffer pool configuration:
- Few buffer pools
- Approximate object separation
- Unlike objects sharing the same pool
Typical buffer pool configuration
The following figure illustrates a better configuration, with more buffer pools and more segregation of data into similar performance classes.
Efficient buffer pool configuration
By grouping objects with similar access and priority characteristics into multiple buffer pools, data pages are shared more equitably in each pool and overall hit ratios are improved. In addition, the size of each pool can now be varied individually to achieve caching targets by object group according to priority and workload demand.
While this process is straightforward in principle, two primary factors have limited its use:
- The data that is required to classify and group similar objects is not readily available. Expensive, high-volume buffer traces are required to collect the data.
- Data objects at a typical DB2 site now number in the thousands or tens of thousands.
The effort required to implement this straightforward strategy is too great in most IS shops today. Consequently, little real buffer pool tuning is being performed.
Solution
Pool Advisor can perform object classification far better than any comparable product available today.
Pool Advisor can process large lists of buffer pools and data objects. It more accurately groups data types by considering more performance characteristics than most DBAs can afford to consider:
- Type of object (table or index)
- Degree of sequential access versus random access
- Activity level
- Object size
- Update rate
- Optimum working set size for best hit rates
- Priority
The Configuration Advisor calculates a "score" that represents the overall fitness of the current configuration for the measured attributes of the page sets. You can then submit an analysis request that evaluates the mix of pools and objects and recommend changes to the assignment of objects to pools and the attributes of the pools. You have the option to accept and implement those changes or modify and resubmit the analysis request.
Using this iterative and interactive approach, you can reach a configuration that meets the needs of your objects, while fitting within acceptable limits of complexity and resource use.