Recovery from database Deadlock/Timeout errors


Occasionally DB2 issues the SQL(-911) SQLCODE or IDMS issues an ERROR-STATUS 29, meaning that while attempting to service the request issued by the application program, the DBMS encountered a “lock” on a resource (probably some data) needed to satisfy the request. DB2 issues an internal “ROLLBACK”, such that DB2 Tables are returned to their state at the time of the last DB2 “COMMIT”.

Depending on the behavior of the application program following the failed statement this can be a very dangerous and awkward situation.

For example, if the application program chooses to “loop” until the needed DB2 resource is available by periodically re-issuing the same SQL statement, there is the danger that if DB2 should later successfully obtain the necessary resources, the processing negated by the DB2 internal “ROLLBACK” following the SQL(-911) may never be processed - thus being “dropped”. In addition, if the application had been tallying counters or totals within its Working Storage, there is the potential for those counters and totals to represent data which was, logically, never encountered, due to the “ROLLBACK”.

In addition, if there were any open files at the time of the SQL(-911), they could also be logically unsynchronized with respect to DB2 Tables following the “ROLLBACK”.

In most technical manuals, the recommended action to be taken following an SQL(- 911) is to ABEND the program so that all entities are synchronized at the last checkpoint. (If the program is not restartable, then a database restore is usually required, followed by the re-running of all jobs which executed after the database backup.)

However, this practice lends itself to numerous and frequent program ABENDs. As more application programs begin using DB2, more failures of this type occur. Frustrations begin building among Programmers, DBAs and Managers alike.

QUICKSTART offers the application program the opportunity to avoid these problems by providing a method which allows for retries in safety.

Upon encountering an SQL(-911) an application program using QUICKSTART is free to “loop”, if desired, instead of issuing an ABEND. In Transparent Mode, this will be done automatically if program QSATTACH is specified on the EXEC PGM= JCL card. In API Mode, the only modification the application program must perform is to issue an “END” with “ROLLBACK” request to QUICKSTART (MOVE 'R' TO CKPTROLLBACK, MOVE 'E' TO CKPT-REQUEST-TYPE). QUICKSTART effectively issues a “ROLLBACK” of the application's Checkpoint Area and its sequential files, thus eliminating the synchronization dangers described above. This results in fully  synchronized DB2 Tables, Checkpoint Area and sequential files, giving the application program the opportunity to “loop” instead of imperatively issuing an ABEND following an SQL(-911). The application program must then resume processing at the start of the program.

Note

DO NOT move spaces to CKPT-ROLLBACK before resuming at the start of the program. Also, there is no need to move 'S' to CKPT-REQUEST-TYPE before resuming at the top of the program.

If you are using FD interception (QSFD) with -911 support, you must issue CALL QSCLOSER (or QSCLRADV if file was opened with QSOPNADV) verbs for each of the files before returning to beginning of the program. DO NOT use the QSCLOSE copybooks to perform these CLOSEs, simply issue COBOL CLOSE verbs.

The following is an example of the necessary application program coding changes:

EXEC-SQL
  SELECT
    .
    .
    .
  END-EXEC.


IF SQLCODE IS LESS THAN ZERO


    IF SQLCODE IS EQUAL TO -911


* CLOSE any COBOL FD's used with QUICKSTART's
* FD Interception feature in both the main and
     sub-programs COPY QSCLOSER REPLACING MYFD BY
     fdname.


     MOVE 'E' TO CKPT-REQUEST-TYPE
     MOVE 'R' TO CKPT-ROLLBACK


     PERFORM CKPT-RTN


     GO TO (the start of the program)
    END-IF
END-IF.

Related topic


 

Tip: For faster searching, add an asterisk to the end of your partial query. Example: cert*