Distribution of work

The High-speed Apply Engine uses two basic techniques to balance the workload for an apply request between multiple agents: distribution by object (table or partition) and distribution by unit of recovery (UR).

High-speed Apply Engine uses default values to choose between these techniques. To override the defaults and optimize distribution between agents for your environment, use configuration parameters. For more information about default values and which techniques take precedence over others, see DistributionTuning-parameters.

Distribution by object

The High-speed Apply Engine can distribute work between agents based on the tables or partitions that the SQL statements in the input file update.

High-speed Apply Engine uses the same agent to process all SQL statements that affect the same table or partition, regardless of what units of recovery contain the statements (unless High-speed Apply Engine regroups tables because of RI relationships). High-speed Apply Engine offers several options for this type of distribution (for more information, see Options for distribution by object). High-speed Apply Engine might also use the same agent for all tables in RI relationships (subject to parameters that you can adjust).

Advantages of distribution by object: High-speed Apply Engine can increase parallelism and use multiple agents more efficiently. This type of distribution frequently provides faster performance.
Disadvantages of distribution by object: High-speed Apply Engine provides data integrity for the overall apply request, but does not preserve transaction integrity during execution. With this technique, High-speed Apply Engine does not honor the original COMMIT or ROLLBACK statements (High-speed Apply Engine commits work based on other parameters that you can define). You can encounter more conflicts and you might need to define more conflict resolution rules.

Be aware of the following points regarding distribution by object:

Use the [CommitTriggers] parameters to control when High-speed Apply Engine commits work. The values of these parameters greatly affect performance.
To request that High-speed Apply Engine not honor database-defined RI relationships between tables or partitions, use the RIClustering parameter.
Distribution is based on the base table when an alias or single table view is used to identify the table_name in the SQL statement (for example, INSERT INTO table_name (col1, col2) values ('xxx', 'yyy'); ).

Options for distribution by object

High-speed Apply Engine provides the following options for distribution by object:

Groups of tables
High-speed Apply Engine distributes work between agents based on groups of tables. You can define your own groups or use existing groups that are defined within Db2. High-speed Apply Engine uses the same agent to process all SQL statements that affect the tables in a group. High-speed Apply Engine supports the following techniques to define groups of tables:
- Groups of tables that are defined by users
  Use the [ObjectCluster] parameters to define your own groups of tables. This option provides the most flexibility; you can respond to changing data distribution by changing table groups for each run of an apply request.
- Groups of tables that are defined by RI relationships
  By default, High-speed Apply Engine groups tables based on database-defined RI relationships (subject to parameters that you can adjust). For more information, see RIRecursionLimit. This type of distribution preserves RI for all updates that the apply request make. Use the RIClustering parameter to avoid this grouping. For more information, see RIClustering.
- Groups of tables that are defined by simple table spaces
  By default, High-speed Apply Engine includes all tables in a simple (nonsegmented) table space in the same group and uses the same agent to process all SQL statements for those tables. Use the SimpleTSClustering parameter to avoid this kind of table grouping. For more information, see SimpleTSClustering.

Partitions
High-speed Apply Engine distributes work between agents based on table partitions. You can define your own partitions for a nonpartitioned table, tell High-speed Apply Engine to use partitions that are different than the existing partitions of a table, or use the existing partitions of a table. High-speed Apply Engine uses the same agent to process all SQL statements that affect a partition that you define or use.
High-speed Apply Engine supports the following techniques to define partitions:
- Partitions that are defined by users
  Use the [ObjectPartition] parameters (ObjectPartition-parameters) to define your own partitions for nonpartitioned or partitioned tables. (This process is sometimes referred to as logical partitioning.) This option provides the most flexibility; you can respond to changing data distribution by changing your partitioning for each run of an apply request.
- Partitions that are defined by partitioned table spaces
  Use the PartitionClustering parameter (PartitionClustering) to direct High-speed Apply Engine to use the same agent to process all SQL statements that affect the same physical partition of a table.
Individual tables
High-speed Apply Engine uses the same agent to process all SQL statements that affect an individual table in your input file. High-speed Apply Engine defaults to this option only when
- You do not activate any of the other options for distribution by object
- The tables in your input file are not part of database-defined RI relationships (or you disable RI clustering)
- The tables in your input file are not stored in a simple table space (or you disable simple table space clustering)
- The tables in your input file are not partitioned tables

Distribution by unit of recovery

The High-speed Apply Engine can distribute work between agents based on the units of recovery (transactions) contained in the input file.

High-speed Apply Engine uses the same agent to process all SQL statements in an original transaction, regardless of what tables or partitions the statements update. (Two different transactions that update the same tables might not be processed by the same agent.) High-speed Apply Engine accommodates dependencies between transactions and preserves the order of the original transactions for restart purposes. For more information, see DistributionType.

Advantages of distribution by unit of recovery (UR): High-speed Apply Engine preserves the original transactions and honors the original COMMIT or ROLLBACK statements. This preserves transaction integrity during an apply request. High-speed Apply Engine performs faster when there are few dependencies between transactions (for example, when the transactions do not update the same tables).
Disadvantages of distribution by unit of recovery: Performance can be slower when there are dependencies between transactions, because the different agents wait more frequently to accommodate the dependencies and preserve the original order of transactions.

Use this type of distribution when transaction integrity during the apply request is more important than speed. Be aware of the following points regarding distribution by UR:

Under distribution by UR, configuration parameters that are related to distribution by object do not apply (such as [CommitTriggers], [ObjectPartition], and many [DistributionTuning] parameters, such as PartitionClustering and RIClustering).
Under distribution by UR, High-speed Apply Engine is more likely to use the maximum number of agents that are available (depending on the number of transactions in your input and the dependencies between them). Each agent requires a separate connection to the database (for Db2 on mainframe targets, a separate thread on the Db2 subsystem). Use the MaxAgents parameter to prevent High-speed Apply Engine from using more database connections, and by extension, more resources than your environment can tolerate. For more information, see MaxAgents.
You can control how High-speed Apply Engine processes referential integrity (RI) dependencies between transactions. When your input contains many RI dependencies, consider using the URCheckRI parameter (for more information, see URCheckRI). When the input contains fewer RI dependencies, consider using a conflict resolution rule with Code parameter value RIConflict and Action parameter value Retry.

Distribution of work

Distribution by object

Options for distribution by object

Distribution by unit of recovery

On this page