Best practices when syncing data from BMC Helix ITSM SaaS to on premises 


You might need to synchronize some of your data from your BMC Helix ITSM environment running in the cloud to your local system running on premises. Some of the most common reasons for this sync are: 

  • Local reporting requirements 
  • Local integration requirements 
  • Data compliance  

You can sync your data in several ways. However, in each of these cases, you should keep in mind the following best practices: 

  • Bi-directional sync is not allowed.
  • Sync the minimal amount of data that you need. 
  • Sync only incremental changes. 
  • Run sync against the read-only reporting database. 
  • Schedule the sync only when data needs to get updated. 
  • Monitor the sync jobs. 
  • Review customizations on upgrade to determine if changes are needed. 


Bi-directional sync is not allowed

Data must flow only from the SaaS environment to the on-premises environment. The only acceptable updates back into the SaaS environment are for configuration purposes, such as tracking when the last job ran. This keeps the production database as your single source of truth and avoids the circular update issues that can occur when data is sent in both directions. 

Sync the minimal amount of data that you need

As a best practice, review the overall ITSM data model and sync only the data needed to support your use cases. BMC Helix ITSM has many forms that exist only for processing or configuration purposes. Do not sync any data that is not needed by the functionality you are using on premises. For example, if you need only incident information and related change records, and not the detailed worklogs, bring data from the HPD:Help Desk, CHG:Infrastructure Change, HPD:Associations, and CHG:Associations forms (the latter only if you need the relationship type from that direction). You do not need to bring data from the HPD:WorkLog and CHG:WorkLog forms. 

The best way to sync the right data is to document your use cases and then map the forms that hold the needed data and any forms that store the pertinent relationships. 

This best practice also applies to the fields that you sync on premises. Pull only the fields that support your use cases. Doing so reduces the size of the data sent to the on-premises system. For example, ITSM has many fields used for workflow or foreign-key purposes; if such fields are not required for the on-premises use cases, do not sync them. This reduces the overall size of each ticket and speeds up the transfer. 
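For example, when pulling incidents through a REST entry endpoint, you can restrict both the qualification and the returned fields so each payload stays small. The following is a minimal sketch of a helper that builds such a query URL; the endpoint path and field names follow common AR System REST API conventions, but the server URL is hypothetical and you should verify the exact parameters against your environment:

```python
from urllib.parse import urlencode, quote

def build_entry_query(base_url, form_name, fields, qualification=None):
    """Build a GET URL for a REST entry endpoint that returns only
    the listed fields, keeping each payload as small as possible."""
    params = {"fields": "values(" + ",".join(fields) + ")"}
    if qualification:
        params["q"] = qualification
    return (f"{base_url}/api/arsys/v1/entry/{quote(form_name)}"
            f"?{urlencode(params)}")

# Hypothetical server URL; pull only the fields the use case needs.
url = build_entry_query(
    "https://helix.example.com",
    "HPD:Help Desk",
    ["Incident Number", "Status", "Priority", "Summary"],
    qualification="'Status' != \"Closed\"",
)
```

Building the query once in a helper like this also gives you a single place to enforce the "minimal fields" rule across all sync jobs.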

Sync only incremental changes

Ideally, update only the data that has changed since the last sync. You can accomplish this by checking the last modified date on the data that you are pulling into the on-premises system. For newly created records, the last modified date is the same as the created date, so a last-modified filter also captures new records. 
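A simple way to implement the incremental pull is to persist the high-water mark of the last successful sync and qualify the next query on it. The sketch below assumes AR-style qualification syntax and a hypothetical local JSON file as the timestamp store:

```python
import json
from pathlib import Path

STATE_FILE = Path("sync_state.json")   # hypothetical local state file

def load_last_sync(form_name):
    """Return the last successful sync timestamp for a form, or None."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text()).get(form_name)
    return None

def save_last_sync(form_name, timestamp):
    """Record the high-water mark after a successful run."""
    state = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}
    state[form_name] = timestamp
    STATE_FILE.write_text(json.dumps(state))

def incremental_qualification(last_sync):
    """Build an AR-style qualification matching records changed
    (or, for new records, created) since the last sync."""
    if last_sync is None:
        return None            # first run: full pull
    return f"'Last Modified Date' > \"{last_sync}\""

q = incremental_qualification("2024-05-01T00:00:00Z")
```

Saving the timestamp only after a run completes without errors ensures a failed run is retried from the same high-water mark rather than silently skipped.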

If you need to sync data from a join form to on premises, the best practice is to sync the members of the join instead of the join itself. This ensures that only the changed data in the join member tables is pulled, rather than data from all members of the join, which is especially costly for deep joins. 
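For instance, rather than pulling a join of HPD:Help Desk and HPD:Associations, you can pull each member form incrementally and relate the records on premises. The sketch below shows the local join step; the association field names are illustrative and should be checked against your actual form definitions:

```python
def relate_incidents(incidents, associations):
    """Join incident records to association records locally, so each
    member form can be synced incrementally on its own schedule."""
    by_id = {rec["Incident Number"]: rec for rec in incidents}
    related = []
    for assoc in associations:
        # "Request ID01"/"Request ID02" are illustrative field names.
        incident = by_id.get(assoc["Request ID01"])
        if incident:
            related.append({**incident,
                            "Related To": assoc["Request ID02"],
                            "Association Type": assoc["Association Type01"]})
    return related

incidents = [{"Incident Number": "INC000000000001", "Status": "Open"}]
associations = [{"Request ID01": "INC000000000001",
                 "Request ID02": "CRQ000000000007",
                 "Association Type01": "Related to"}]
related = relate_incidents(incidents, associations)
```

Because each member table is pulled with its own last-modified filter, an update to an association no longer forces a re-pull of unchanged incident rows.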

Run sync against the read-only reporting database

Because the sync only pulls data, run it against read-only pods or the read-only reporting database whenever possible. This keeps the additional query load off the database that serves user-facing operations. 

Schedule the sync only when data needs to get updated

When you are syncing data, determine how closely the on-premises data must track the data in the cloud. If possible, reduce the frequency so that the sync jobs can run during off hours. If some data needs to be close to real time, determine whether you can limit the near-real-time sync to just that data rather than all data. Ideally, run bulk sync operations on weekends, when fewer users are on the environment. Another approach is to have separate jobs that pull data at different intervals: one job for near-real-time data, limited to just the data that must be as current as possible, and other jobs that run during periods of lower load for data that does not need to be refreshed as often. 
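The tiered-interval approach can be expressed as a small job table: each job lists the forms it pulls and how often it runs, so near-real-time data stays in its own frequent job while bulk data waits for off hours. A sketch with hypothetical job definitions:

```python
from datetime import datetime, timedelta

# Hypothetical job definitions: forms to pull plus sync interval.
JOBS = {
    "near_real_time": {"forms": ["HPD:Help Desk"],
                       "interval": timedelta(minutes=15)},
    "off_hours_bulk": {"forms": ["HPD:Associations",
                                 "CHG:Infrastructure Change"],
                       "interval": timedelta(days=1)},
}

def jobs_due(last_runs, now):
    """Return the names of jobs whose interval has elapsed since their
    last run; jobs that have never run are always due."""
    due = []
    for name, job in JOBS.items():
        last = last_runs.get(name)
        if last is None or now - last >= job["interval"]:
            due.append(name)
    return due
```

A scheduler loop (or an external scheduler such as cron) can call `jobs_due` each tick and trigger only the jobs it returns, keeping the heavy jobs out of business hours.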

Monitor the sync jobs

It is important to monitor your sync jobs to make sure they are running effectively. The following are some metrics to monitor:  

  • Number of jobs that are running and the number of jobs running without errors 
  • Amount of data the job is pulling 
  • Amount of time taken for the job to run 

Monitoring that jobs are running, and running without errors, ensures that the on-premises systems that depend on the data are receiving the latest updates. 

Monitoring the amount of data also helps you track how much information is changing in your environment, whether the load is increasing, and whether you need to run the data sync more often. 

The amount of data also affects how long a job takes to run. For example, suppose you scheduled your jobs during off hours, but because of additional load, a job takes longer and starts to run into times of day when end-user traffic is growing. In that case, you would need to adjust the job schedule to start earlier. 
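These three metrics can be captured per run and checked against simple thresholds, so a job that starts overrunning its window is flagged before it collides with business hours. A sketch with hypothetical threshold values:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class SyncRunMetrics:
    """Per-run record of the metrics worth monitoring."""
    job_name: str
    started: datetime
    finished: datetime
    rows_pulled: int
    errors: int

    @property
    def duration(self):
        return self.finished - self.started

def check_run(metrics, max_duration, window_end):
    """Return warnings if the run had errors, overran its expected
    duration, or finished past the allowed maintenance window."""
    warnings = []
    if metrics.errors:
        warnings.append(f"{metrics.job_name}: {metrics.errors} errors")
    if metrics.duration > max_duration:
        warnings.append(f"{metrics.job_name}: ran {metrics.duration}, "
                        f"expected under {max_duration}")
    if metrics.finished > window_end:
        warnings.append(f"{metrics.job_name}: finished after window end")
    return warnings
```

Tracking `rows_pulled` over time in the same record also gives you the trend data needed to decide when to rebalance jobs or move a start time earlier.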

Review customizations on upgrade to determine if changes are needed

Fields are sometimes added during an upgrade or as part of a customization. As part of the reconciliation process of accepting the changes and promoting customizations from the development or QA system to production, review whether any of the added fields should also be added to the on-premises environment. One approach is to apply the upgrade D2P packages to your on-premises environment to keep it in sync. 

To help identify any issues during an upgrade, the on-premises environment should also include a development and QA instance, so that changes can be tested and verified before moving them to production. 

 
