Monitoring a reconciliation job
You can monitor the progress of a reconciliation job and take action for queued jobs on the reconciliation page or by using the fail-safe mechanism in the CMDB Portal.
In BMC Helix CMDB versions 20.08 and earlier, you had to view and cancel a queued job by using the RE:Job Runs and RE:Application Pending forms. In BMC Helix CMDB versions 21.02 and later, the queued jobs are visible in the Job Details section on the reconciliation page in CMDB Portal.
The Queued Jobs tab is visible only if there are any jobs that are unresponsive and queued. You cannot edit or restart a queued job directly. You must first cancel the job on the reconciliation page in CMDB Portal and then restart the job.
Before you begin
You must have at least one reconciliation job that has been created or executed. To create a reconciliation job, see Creating a reconciliation job.
To monitor reconciliation jobs
- On CMDB Portal, select Jobs > Manage Reconciliation.
The following page is displayed.
- From the Dataset list, select a data set name for which you want to view the jobs, or select All to view jobs for all the data sets in CMDB.
- Select the time period for which you want to view the jobs.
The ribbon component displays the following fields:
The values displayed in the Jobs ribbon component are run-time values. The values in the CIs and Relationships ribbon components are generated by the Datasources utility. This utility runs at scheduled times, as set in the Configurations menu. A message above the ribbon provides the date and time when the server last ran a reconciliation job.
|Ribbon component||Field||Description|
|Jobs||Total||Count of all jobs that have run at least once plus jobs that have not run even once.|
|Jobs||Executed||Count of jobs that have run at least once for the applied data set and time-period filters.|
|Jobs||Pending||Count of jobs that have not run even once for the applied data set and time-period filters.|
|CIs||Total||Count of all CIs processed in all the executed jobs.|
|CIs||Good||Count of CIs that are successfully reconciled for the executed jobs.|
|CIs||Errors||Count of CIs that failed reconciliation for the executed jobs.|
|Relationships||Total||Count of all relationships processed in all the executed jobs.|
|Relationships||Good||Count of relationships that are successfully reconciled for the executed jobs.|
|Relationships||Errors||Count of relationships that failed reconciliation for the executed jobs.|
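The relationship between the three Jobs fields can be sketched in a few lines of Python. This is only an illustration of the definitions above: the JobSummary record and the jobs_ribbon function are hypothetical names, not CMDB forms or APIs, and the list passed in is assumed to be already filtered by data set and time period.

```python
from dataclasses import dataclass


@dataclass
class JobSummary:
    """Hypothetical record for illustration; not a CMDB form or API object."""
    name: str
    run_count: int  # number of times the job has run so far


def jobs_ribbon(jobs):
    """Return the Jobs ribbon counts for an already-filtered list of jobs."""
    # Executed: jobs that have run at least once.
    executed = sum(1 for job in jobs if job.run_count >= 1)
    # Pending: jobs that have not run even once.
    pending = sum(1 for job in jobs if job.run_count == 0)
    # Total: executed jobs plus pending jobs.
    return {"Total": executed + pending, "Executed": executed, "Pending": pending}
```

For example, a list with two jobs that have run and one job that has never run yields Total 3, Executed 2, and Pending 1.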
- The time-period values depend on the date of the last dashboard utility run, not on the actual calendar date.
- The information displayed for the Today option is the information populated in the last dashboard utility run. For example, you select Today in the time-period filter on 20 July 2017. If the dashboard utility was last run on 16 July 2017, the CI details that are displayed are for 16 July 2017. Additionally, if you select Last 7 days in the time-period filter, the CI details that are displayed are from 9 July 2017 to 15 July 2017.
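The date arithmetic in the example above can be written out as a short Python sketch. The filter_window function is a hypothetical helper, and the rule for Last 7 days (the seven days that end on the day before the last utility run) is inferred from the example dates rather than taken from a documented formula.

```python
from datetime import date, timedelta


def filter_window(last_utility_run: date, option: str):
    """Return the (start, end) dates covered by a time-period option.

    The window is anchored to the last dashboard utility run,
    not to the calendar date on which the filter is selected.
    """
    if option == "Today":
        return last_utility_run, last_utility_run
    if option == "Last 7 days":
        # Assumption inferred from the example: the seven days
        # that end on the day before the last utility run.
        end = last_utility_run - timedelta(days=1)
        start = last_utility_run - timedelta(days=7)
        return start, end
    raise ValueError(f"Unsupported option in this sketch: {option}")


last_run = date(2017, 7, 16)
print(filter_window(last_run, "Today"))
print(filter_window(last_run, "Last 7 days"))
```

With a last utility run on 16 July 2017, Today maps to 16 July 2017 and Last 7 days maps to 9 July 2017 through 15 July 2017, regardless of the day on which you apply the filter.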
Below the ribbon component, the All Job Run Activities table lists the details of each job in a row. Each row has the following fields:
Displays the job name that you provided when creating the job. Click the job name to view the reconciliation job details. The reconciliation job page displays the following information:
You can edit, delete, and start the job from this page.
Displays one of the following job statuses:
Displays the count of CIs that succeeded and the count of CIs that failed to reconcile.
Values in the green CIs column represent the number of CIs reconciled successfully, while values in the red CIs column represent the number of CIs that failed reconciliation. You can drill down into the failed CIs by clicking the link for the number of failed CIs. This drill-down page provides a Recommended Actions column that lists the solutions for resolving the failed CIs.
Displays the count of relationships that succeeded and the count of relationships that failed to reconcile.
Values in the green Relationships column represent the number of relationships reconciled successfully, while values in the red Relationships column represent the number of relationships that failed reconciliation. You can drill down into the failed relationships by clicking the link for the number of failed relationships. This drill-down page provides a Recommended Actions column that lists the solutions for resolving the failed relationships.
Displays the number of times the job has run so far. Click the number to view a page that displays the job run history such as the run status, start time, and end time for each run of the job. Click the Status of a job to expand the job row and view the Activities and Events tabs.
Important: A reconciliation job with multiple activities of the same type displays multiple job entries in the last-run job.
|Activities||Displays the number of activities configured for the reconciliation job. Click the number to expand the job row and view the Activities and Events tabs. The Activities tab displays the details of the Identify, Merge, and Purge activities configured for the job, the current status of each activity, and so on. The Events tab provides a description of each activity, such as the number of records found and the number of identification failures.|
If there are any corrupted jobs, the following message is displayed on the reconciliation page at the bottom of the list of all reconciliation jobs.
For more information about stale jobs, see Troubleshooting Reconciliation Engine jobs that do not start or finish.
To view, cancel, and restart a queued job
- On the ribbon, click Queued.
The list of queued jobs is displayed below the ribbon.
- Select the jobs that you want to remove from the queue and click Cancel Queued jobs.
After a job is cancelled, it is displayed as a pending job in Pending.
- To restart a job that was cancelled:
- Open the job from the Pending list.
- On the job page, click Start Job.
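The cancel-then-restart rule in this procedure can be modeled as a small state table. The following Python sketch is illustrative only: the state name Running and the function apply_action are assumptions made for the example, and nothing here calls CMDB Portal.

```python
# Allowed transitions for a queued job, as described in the procedure above.
# "Running" is an assumed label for a job that has been started.
ALLOWED = {
    ("Queued", "cancel"): "Pending",   # a cancelled job appears under Pending
    ("Pending", "start"): "Running",   # Start Job on the job page
}


def apply_action(state: str, action: str) -> str:
    """Return the next state, or raise if the action is not allowed."""
    try:
        return ALLOWED[(state, action)]
    except KeyError:
        raise ValueError(f"Cannot {action} a job that is {state}") from None
```

Calling apply_action("Queued", "start") raises an error, which mirrors the rule that a queued job cannot be restarted directly and must be cancelled first.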
Overview of the fail-safe mechanism
The fail-safe mechanism in Reconciliation Engine monitors the progress of all the running jobs (scheduled, continuous, and non-continuous). When it encounters an unresponsive job, the mechanism automatically restarts the job.
For example, if you have set the job idle time to 500 minutes, the fail-safe mechanism checks all the jobs every 500 minutes. If the fail-safe mechanism detects a job that has not responded for more than 500 minutes, it restarts that job. To configure Job Idle Time, see To configure the fail-safe mechanism.
The following table describes the fail-safe mechanism actions that Reconciliation Engine takes for different types of reconciliation job runs:
|Type of reconciliation job||System action|
|Scheduled||Stops the current job run and lets the next run start after the scheduled interval of that job has elapsed.|
|Non-continuous||Stops the current job run and starts a new run.|
|Continuous||Stops the current job run and starts the next run after the interval of the current run has elapsed. The job does not stop immediately.|
Important: If you make changes to a continuous job that is running, the changes take effect only after the job stops and is restarted manually.
The fail-safe mechanism is available only when you configure a value for the Job Idle Time (minutes) parameter and all the traces in the arrecond.log file. If you do not want to configure the fail-safe mechanism, keep the default Job Idle Time value of 0 (Zero). The AR System installation directory is the default location of the log file. For more information about Job Idle Time, see Reconciliation Engine configuration settings.
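The decision logic described in this section can be summarized in a short Python sketch. It is a simplified model, not the Reconciliation Engine implementation: the function fail_safe_action, its parameters, and the returned strings are hypothetical, and the sketch ignores the log trace requirement mentioned above.

```python
NO_ACTION = "no action"

# System action per job type, paraphrased from the table above.
ACTIONS = {
    "scheduled": "stop run; next run starts after the scheduled interval",
    "non-continuous": "stop run; start a new run",
    "continuous": "stop run; next run starts after the current interval",
}


def fail_safe_action(job_type: str, idle_minutes: int, job_idle_time: int) -> str:
    """Return the action this sketch would take for one running job."""
    # A Job Idle Time of 0 (the default) means the fail-safe is not configured.
    if job_idle_time == 0:
        return NO_ACTION
    # A job is treated as unresponsive only after it has been idle
    # for more than the configured number of minutes.
    if idle_minutes <= job_idle_time:
        return NO_ACTION
    return ACTIONS[job_type]


print(fail_safe_action("non-continuous", 510, 500))
print(fail_safe_action("scheduled", 510, 0))
```

With a Job Idle Time of 500, a non-continuous job that has been idle for 510 minutes is stopped and started again, while the same job is left alone when Job Idle Time is 0.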
To configure the fail-safe mechanism
- In CMDB Portal, select Configurations > Core Configurations > Reconciliation.
- To define the duration that a job can remain idle, set a value in Job Idle Time (minutes).
The default job idle time is 0 minutes.
The following video provides an overview of the Reconciliation component in the dashboard.