Command failures


Command failures can occur when TOM issues a START or a STOP command for a defined object.

When TOM issues a START or a STOP command for an object, it waits for start, stop, or ABEND validation events for a user-defined amount of time (the default is one minute). You do not have to supply events as part of an object’s definition. When you do not specify an event for an STM object, TOM uses the object's initiation or end-of-memory event as evidence of a successful initiation or termination of the object.

If TOM does not receive notification of the validation event before the time limit expires, processing for this object goes into recovery mode.
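
The wait-then-recover behavior described above can be pictured with a small Python sketch. This is an illustration only, not TOM's implementation: the function name, the "validated" and "recovery" labels, and the use of a threading.Event to stand in for a validation event are all assumptions made for the example.

```python
import threading
from typing import Optional

# One-minute default, as described in the text above.
DEFAULT_WAIT_SECONDS = 60.0


def wait_for_validation(event: threading.Event,
                        timeout: Optional[float] = None) -> str:
    """Wait for a (hypothetical) validation event.

    Returns "validated" if the event arrives within the time limit,
    otherwise "recovery" to indicate that recovery mode would begin.
    """
    limit = DEFAULT_WAIT_SECONDS if timeout is None else timeout
    # Event.wait returns True if the event was set before the timeout.
    return "validated" if event.wait(limit) else "recovery"
```

Passing a short timeout makes the sketch easy to try: an event that is already set yields "validated", and an event that is never set yields "recovery" once the limit passes.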

To perform recovery of a failed command, TOM uses the list of optional retry commands that you can define as part of each object’s definition. TOM issues the retry commands until they are all issued or TOM is notified that the event has occurred.

When no retry commands are specified or all retry commands are exhausted, the object’s status is set to one of the following values:

  • FAILURE-REC-INIT for objects that did not start
  • FAILURE-REC-TERM for objects that did not stop

The following factors can determine whether a retry command is issued and how it is issued:

  • Schedule dependency
    TOM selects the first retry command that is defined for the object. Each retry can be defined with a schedule dependency, which specifies when a command can be issued. If the time that is specified in the dependency definition indicates that the command is not to be issued at this time, TOM moves to the next command in the list. If a retry command is not defined with a schedule dependency, TOM issues the retry command.
  • Command count
    Each START or STOP RETRY command has a command count attribute and a command interval attribute. The count indicates how many times the command can be issued. The interval controls how long to wait between issuing commands. The task that issues the command waits either until TOM receives the specified event (the object is now ACTIVE for a start; STOPPED, LOCKED, or BLOCKED for a stop) or until the waiting period expires.

    When the wait period expires, TOM checks the command count attribute. If the command count has not been exceeded, TOM issues the RETRY command again. When the count is exceeded, the counter is reset, and TOM issues the next retry command in the list.
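
The retry selection described in this topic can be summarized with a minimal Python sketch. It is a model of the documented behavior under stated assumptions, not TOM code: the RetryCommand class, the run_recovery function, the issue and wait_for_event callbacks, and the example command text are all hypothetical names introduced for illustration. The sketch also assumes that a count of N means the command is issued at most N times.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RetryCommand:
    """Hypothetical model of one retry command in an object's definition."""
    text: str                       # command text to issue
    count: int = 1                  # how many times the command can be issued
    interval: float = 0.0           # seconds to wait after each issue
    # None models "no schedule dependency"; otherwise the callable says
    # whether the schedule allows the command to be issued right now.
    schedule_allows: Optional[Callable[[], bool]] = None


def run_recovery(retries: List[RetryCommand],
                 issue: Callable[[str], None],
                 wait_for_event: Callable[[float], bool],
                 action: str = "START") -> Optional[str]:
    """Walk the retry list in order.

    Returns None as soon as the expected event arrives, or the failure
    status once every retry command has been skipped or exhausted.
    """
    for cmd in retries:
        if cmd.schedule_allows is not None and not cmd.schedule_allows():
            continue  # schedule dependency says "not now": try the next one
        for _ in range(cmd.count):
            issue(cmd.text)
            if wait_for_event(cmd.interval):
                return None  # event received, recovery succeeded
        # Count exhausted: fall through to the next retry command.
    return "FAILURE-REC-INIT" if action == "START" else "FAILURE-REC-TERM"


if __name__ == "__main__":
    issued: List[str] = []
    # Simulated event source (assumption): the event never arrives.
    status = run_recovery([RetryCommand("EXAMPLE-CMD", count=2)],
                          issued.append, lambda interval: False)
    print(status, issued)
```

With an empty retry list, the function returns the failure status immediately, which matches the case where no retry commands are specified.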

 
