This documentation applies to the 8.1 version of BMC Atrium Core, which is in "End of Version Support." You will not be able to leave comments.

Monitoring reconciliation jobs using the Fail safe feature

The Fail safe feature in the reconciliation engine monitors the progress of all running jobs (scheduled, continuous, and non-continuous). When it finds a job that is not responding, it automatically restarts the job.

For non-continuous jobs, the feature stops the current run and starts a new run. For continuous jobs, it stops the current run and lets the next run start after the continuous interval of that job has elapsed.

For example, if the job idle time is configured as 60 minutes, the Fail safe feature checks all jobs every 60 minutes. If it detects a job that has not responded (that is, has not processed a CI) for more than 60 minutes, it restarts that job.
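
The decision described above can be pictured with a minimal Python sketch. This is an illustration only, not the reconciliation engine's implementation: the `Job` class, its fields, and the `fail_safe_action` function are hypothetical names introduced here, and the sketch assumes that "not responding" means no CI was processed for longer than the configured idle time.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # Hypothetical model of a reconciliation job, for illustration only.
    name: str
    continuous: bool
    minutes_since_last_ci: float  # time since the job last processed a CI


def fail_safe_action(job: Job, idle_limit_minutes: int = 60) -> str:
    """Return what this sketch would do with one job during a monitoring pass."""
    if job.minutes_since_last_ci <= idle_limit_minutes:
        return "leave running"
    if job.continuous:
        # Stop the current run; the next run starts on its own
        # once the job's continuous interval has elapsed.
        return "stop; next run after continuous interval"
    # Non-continuous job: stop the current run and start a new one.
    return "stop; start new run"


# Made-up jobs to exercise each branch.
jobs = [
    Job("JobA", continuous=False, minutes_since_last_ci=12),
    Job("JobB", continuous=False, minutes_since_last_ci=75),
    Job("JobC", continuous=True, minutes_since_last_ci=90),
]
actions = {job.name: fail_safe_action(job) for job in jobs}
```

In this sketch, only jobs idle for longer than the limit are touched, and the continuous and non-continuous cases differ only in whether a new run is started immediately.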

The feature writes its trace messages to the arrecond.log file. This log file usually resides in the installation directory, for example, \Program Files\BMC Software\AtriumCore\Logs. If you have configured the log file to reside in a different directory, look for it there.
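
To check whether Fail safe has acted on a job, you could scan arrecond.log for its messages. The following Python sketch assumes the two message formats quoted in the comments on this page (`ERROR: Job: <jobname> is not responding` and `ERROR: Restarting the job: <jobname>`). The rest of the log line layout is not documented here, so the patterns search anywhere in the line. The function name and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Patterns built from the message formats quoted in the comments on this page.
NOT_RESPONDING = re.compile(r"ERROR: Job: (?P<job>.+?) is not responding")
RESTARTING = re.compile(r"ERROR: Restarting the job: (?P<job>.+?)\s*$")


def count_fail_safe_events(lines):
    """Count 'not responding' and 'restarting' messages per job name."""
    stalled, restarted = Counter(), Counter()
    for line in lines:
        match = NOT_RESPONDING.search(line)
        if match:
            stalled[match.group("job")] += 1
            continue
        match = RESTARTING.search(line)
        if match:
            restarted[match.group("job")] += 1
    return stalled, restarted


# Made-up sample lines; real log lines may carry extra prefixes such as timestamps.
sample = [
    "ERROR: Job: Nightly Recon is not responding",
    "ERROR: Restarting the job: Nightly Recon",
    "some unrelated line",
]
stalled, restarted = count_fail_safe_events(sample)
```

To scan a real file, pass an open file object instead of the sample list, for example `with open(path) as f: count_fail_safe_events(f)`, where `path` points to your arrecond.log.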

You should consider the following while working with the Fail safe feature:

  • The time interval for which a job can remain idle is configurable.
  • The configured time interval is in minutes.
  • The default job idle time is 60 minutes.
  • The job idle time can be set to any value from 60 to 720 minutes.

To configure the above settings, use the configuration options provided in Server Configuration.
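
If you set the idle time from a script or keep it in your own deployment notes, a small check like the following can catch out-of-range values before they reach Server Configuration. This is a sketch under stated assumptions: the function and constant names are hypothetical, it does not call any BMC API, and it assumes both ends of the 60 to 720 range are inclusive.

```python
MIN_IDLE_MINUTES = 60       # documented lower bound
MAX_IDLE_MINUTES = 720      # documented upper bound (assumed inclusive)
DEFAULT_IDLE_MINUTES = 60   # documented default


def validate_idle_time(minutes=None):
    """Return a usable job idle time in minutes, or raise ValueError."""
    if minutes is None:
        return DEFAULT_IDLE_MINUTES
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(minutes, bool) or not isinstance(minutes, int):
        raise ValueError("job idle time must be a whole number of minutes")
    if not MIN_IDLE_MINUTES <= minutes <= MAX_IDLE_MINUTES:
        raise ValueError(
            f"job idle time must be between {MIN_IDLE_MINUTES} "
            f"and {MAX_IDLE_MINUTES} minutes, got {minutes}"
        )
    return minutes
```

For example, `validate_idle_time(120)` returns 120, while `validate_idle_time(30)` raises `ValueError` because 30 is below the documented minimum.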

Comments

  1. Worawut Ekarat

    Why can't we specify a lower minimum? A 30-minute, or even 15-minute, idle time may be appropriate for some customers. Could you please check if this is correct? Thanks!

    Dec 09, 2014 10:27
    1. Prachi Kalyani

      Hello,

      Thank you for your comment. The default idle time is 60 minutes because some jobs may take longer to complete. If the idle time is changed to less than 60 minutes, the jobs will be restarted even if they are still processing.

      Hope that helps!

      Thanks,

      Prachi

      Dec 10, 2014 01:04
      1. Worawut Ekarat

        Thanks. But this is idle time, right? Why would the whole job's duration matter?

        Dec 10, 2014 01:15
  2. Worawut Ekarat

    Can you also provide a specific message in arrecond.log when this feature is triggered? This would be helpful to find out if this feature is being utilised or not. Hopefully, it is not just a message that contains the details of the reconciliation engine and its version which is presented when you restart the engine.

    Thank you.

    Dec 09, 2014 10:34
    1. Amol Redij

      Hello,

      Thank you for your comment.

      I will consult with the concerned SME and get back to you.

      Regards,

      Amol

      Dec 17, 2014 03:18
    2. Amol Redij

      Hi,

      The following error messages appear in the arrecond.log file when the feature is enabled:

      ERROR: Job: <jobname> is not responding

      ERROR: Restarting the job: <jobname>

      Regards,

      Amol

      Dec 18, 2014 03:56