Troubleshooting DMT and UDM errors in a server group environment


When you migrate data in a server group environment, issues related to load balancing, Unified Data Management (UDM) form configuration, or server authentication may occur. Follow the troubleshooting instructions in this topic to diagnose and resolve these issues.


Best practice
We recommend that you select a non-user-facing server (for example, the admin server) as the primary server. Additionally, select the default check box for the primary server in the UDM:Config form.

Diagnosing and reporting an issue

Task

Action

Steps

1

Verify that you have ranked the Atrium Integrator servers by using the AR System Server Group Operation Ranking form.

Before configuring the UDM:Config form in a server group environment, we recommend that you rank the Atrium Integrator servers by using the AR System Server Group Operation Ranking form.

If you assign rank 1 to a server, that server becomes the primary server and runs the jobs. If you assign rank 2 to a server, that server becomes the secondary server.

If the primary server fails, the secondary server (failover server) runs the jobs. If you do not assign ranks to the servers in a server group environment, the jobs run on the server that first receives the request.

For more information, see Setting failover rankings for servers and operations.
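
The ranking rules above can be sketched as a small selection function. This is only an illustration of the described behavior, not how AR System implements operation ranking. The `Server` type, the `pick_job_server` function, and the `first_receiver` parameter are hypothetical names. The sketch also assumes that unranked servers are ignored as soon as at least one server has a rank.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Server:
    name: str
    rank: Optional[int]  # rank from the ranking form; None means unranked
    is_up: bool = True


def pick_job_server(servers: Sequence[Server], first_receiver: str) -> Optional[str]:
    """Return the name of the server expected to run the jobs."""
    ranked = sorted((s for s in servers if s.rank is not None), key=lambda s: s.rank)
    if not ranked:
        # No ranks assigned: the server that first receives the request runs the job.
        return first_receiver
    for server in ranked:
        if server.is_up:
            return server.name  # lowest rank number that is still running
    return None  # every ranked server is down
```

For example, with an admin server at rank 1 and a user-facing server at rank 2, the function returns the admin server while it is up and the rank 2 server otherwise.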

2

Perform the UDM:Config form configuration checks.

3

Perform the UDM:RAppPassword form checks.

4

Perform the UDM:PermissionInfo form checks.

5

Check the workarounds for common issues and resolve the problem.

6

If none of the workarounds for common errors resolves the problem, gather the logs before contacting Support.

UDM:Config form configuration checks

Ensure that you perform the following checks while troubleshooting the UDM:Config form configuration:

  • The Atrium Integrator engine server name must be the same as the value of the Server-Connect Name parameter in the ar.cfg file.
  • The host name for each entry must match the server host name defined in the armonitor.cfg file for the diserver and Carte plugin.
  • No long names, aliases, or IP addresses are used as host names in the UDM:Config form.
  • The value of the Is Default field is set to Yes for the primary server (ranked as 1 in the AR System Server Group Operation Ranking form).
  • The Failover server name field must not have any entries.
  • The port value must be 20000.
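
Several of the checks above are mechanical and can be scripted once the form entries have been exported. The following is a minimal sketch under stated assumptions: the `ConfigEntry` field names are hypothetical and only mirror the checks in the list, not the real form schema, and the values are assumed to have been read from the form and from ar.cfg by some other means. It detects IP addresses used as host names, but it cannot detect long names or aliases, and it does not compare host names against armonitor.cfg.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class ConfigEntry:
    # Hypothetical field names that mirror the checks, not the real form schema.
    engine_server_name: str
    host_name: str
    port: int
    is_default: bool
    failover_server_name: str = ""


def looks_like_ip(value: str) -> bool:
    """Return True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def check_entry(entry: ConfigEntry, server_connect_name: str, is_primary: bool) -> List[str]:
    """Return a list of problems found in one UDM:Config entry."""
    problems = []
    if entry.engine_server_name != server_connect_name:
        problems.append("engine server name differs from Server-Connect Name")
    if looks_like_ip(entry.host_name):
        problems.append("host name is an IP address")
    if entry.failover_server_name:
        problems.append("Failover server name must be empty")
    if entry.port != 20000:
        problems.append("port must be 20000")
    if is_primary and not entry.is_default:
        problems.append("Is Default must be Yes for the primary server")
    return problems
```

An empty list means that the entry passed the checks that this sketch covers; the remaining checks in the list still need a manual review.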

Example: To configure entries in an AR server group environment with three servers, we recommend that you enable the diserver or Carte plugin on all three servers. If the default server goes down, the secondary server (ranked as 2 in the AR System Server Group Operation Ranking form) can run the jobs that are created after server 1 goes down, because the plugin is available on that server. While the primary server is up, the job always runs on the primary server irrespective of where it was triggered from. Additionally, on BMC Helix ITSM server, the user session is on, since the Is Default field value is set to Yes on the primary server.



We recommend that you always restart the BMC Helix ITSM server service after you make changes to the UDM:Config form.

Important

If UDM jobs (for example, three jobs) are running when the primary server goes down, you must review these jobs, manually create a new job with the non-promoted data, and then run that job on the secondary server.

This failover procedure is not automatic; if the primary server goes down, you must manually run the jobs on the secondary server.

UDM:RAppPassword form checks

The UDM:RAppPassword form authenticates the Application Service password for $SERVER$ from Mid Tier and then finds the correct server name from the UDM:Config form.

The UDM:RAppPassword form must contain the entries for all the possible server names that can be used to connect to BMC Helix ITSM server, including the following:

  • IP addresses
  • Alias names
  • Load balancer names

After you make changes to the UDM:RAppPassword form, you need not restart BMC Helix ITSM server.

Configure the UDM:RAppPassword form in the same way as the UDM:Config form. For example:

  • newsc-s: AR Server alias name
  • newscorp-vip: LB alias name
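
One way to confirm that every name used to reach the server has an entry is to compare two lists. The sketch below assumes that you have already collected the connection names (IP addresses, aliases, load balancer names) and exported the server names from the UDM:RAppPassword form. The function name is hypothetical, and the case-insensitive comparison is an assumption rather than documented product behavior.

```python
from typing import Iterable, Set


def missing_rapp_entries(connect_names: Iterable[str], rapp_entries: Iterable[str]) -> Set[str]:
    """Return connection names that have no UDM:RAppPassword entry."""
    # Assumption: names are compared case-insensitively and without surrounding spaces.
    present = {name.strip().lower() for name in rapp_entries}
    return {name for name in connect_names if name.strip().lower() not in present}
```

For example, if users connect through newsc-s, newscorp-vip, and an IP address, and the form lists only the first two, the function returns the IP address as the missing entry.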


UDM:PermissionInfo form checks

The UDM:PermissionInfo form is a regular form that contains a list of all the Pentaho transformations, jobs, database connections, slave servers, partition schemas, directory, cluster schemas, and the corresponding user group permissions for the 112 field.

While load balancing, if the Carte Server Name (optional) is set for a particular transformation or job, then the plugin always executes that transformation or job on that particular Carte server. This ensures that the load balancing of the data integration jobs is done across multiple Carte servers. If the Carte Server Name is not configured for a transformation or job, then that transformation or job is always executed on the local Carte server.
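
The routing rule in the preceding paragraph can be expressed as a short function. This is a sketch of the described behavior only, not the plugin's actual code. The function and parameter names are hypothetical, and the check against the UDM:Config server list is an assumption based on the "No Carte Server with name ... exists in UDM:Config form" error that appears in the common issues.

```python
from typing import Collection, Optional


def resolve_carte_server(
    configured_name: Optional[str],
    local_server: str,
    config_servers: Collection[str],
) -> str:
    """Pick the Carte server for a transformation or job."""
    name = (configured_name or "").strip()
    if not name:
        # Carte Server Name is not set: run on the local Carte server.
        return local_server
    if name not in config_servers:
        # Assumed to correspond to the ARERR 8753 message about a missing Carte server.
        raise LookupError(f"No Carte Server with name {name} exists in UDM:Config form")
    return name
```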

If you have performed a database migration, make sure that you update the server references in the following forms, which contain the required data:

  • UDM:ExecutionInstance form
  • UDM:PermissionInfo form
  • UDM:Config form
  • UDM:RAppPassword form

Make sure that you clean the data in the following forms:

  • DMT:Thread Manager form
  • CAI:Events form
  • CAI:EventParams form
  • UDM:Variable form

Gathering logs

Use the following log files to troubleshoot the UDM Load Balance configuration:

File name and location:

  • arjavaplugin.log: C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
  • arcarte.log: C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
  • arerror.log: C:\Program Files\BMC Software\ARSystem\Arserver\Db directory
  • ar.cfg: C:\Program Files\BMC Software\ARSystem\Conf directory
  • pluginsvr_config.xml: C:\Program Files\BMC Software\ARSystem\pluginsvr directory
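
To collect these files before you contact Support, a small script can copy them into one archive and report anything that is missing. This is a minimal sketch: the `bundle_logs` function is a hypothetical helper, and the paths assume the default Windows installation directories listed above, so adjust them to your environment.

```python
import zipfile
from pathlib import Path
from typing import Iterable, List

# Default locations from the list above; adjust if ARSystem is installed elsewhere.
AR_HOME = Path(r"C:\Program Files\BMC Software\ARSystem")
LOG_FILES = [
    AR_HOME / "Arserver" / "Db" / "arjavaplugin.log",
    AR_HOME / "Arserver" / "Db" / "arcarte.log",
    AR_HOME / "Arserver" / "Db" / "arerror.log",
    AR_HOME / "Conf" / "ar.cfg",
    AR_HOME / "pluginsvr" / "pluginsvr_config.xml",
]


def bundle_logs(paths: Iterable[Path], archive: Path) -> List[str]:
    """Zip the files that exist and return the paths of any that are missing."""
    missing = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in paths:
            if path.is_file():
                bundle.write(path, arcname=path.name)
            else:
                missing.append(str(path))
    return missing
```

Calling `bundle_logs(LOG_FILES, Path("udm_logs.zip"))` on each server in the group creates one archive per server. Review ar.cfg for passwords or other sensitive values before you share the archive.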

Resolutions for common issues

The following table lists the common errors that occur while running a DMT or UDM job:

Symptom

Action

The following error messages are displayed:

  • CI-CS-CMDBErrorOutput.0 - ERROR: org.pentaho.di.core.exception.KettleDatabaseException:
  • CI-CS-CMDBErrorOutput.0 - ERROR: Did not find Remedy Application Service password for server X  in UDM:RAppPassword Form on server Y

Verify that the load balance server is listed in the UDM:RAppPassword form and that the Windows hosts file contains the following entries:

  • AR Server IP servergroupname
  • AR Server IP servergroupname.domain.net
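
The hosts file check can be automated by parsing the file and looking for the required names. The sketch below assumes the standard hosts file layout (an IP address followed by one or more names, with `#` starting a comment). The function names are hypothetical, and `servergroupname` is the placeholder used in the entries above.

```python
from typing import Dict, Iterable, Set


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each host name in hosts-file text to its IP address."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), ip)  # the first match wins
    return mapping


def missing_hosts(text: str, required: Iterable[str]) -> Set[str]:
    """Return the required names that the hosts file does not define."""
    known = parse_hosts(text)
    return {name for name in required if name.lower() not in known}
```

On Windows, the hosts file is typically located at C:\Windows\System32\drivers\etc\hosts. Read it as text and pass the contents to `missing_hosts` together with both the short and the fully qualified server group names.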

A Data Management job is stuck in the In Progress state for a long time. The arjavaplugin logs display an error message about an invalid execution instance or an error related to creating an entry in the UDM:Execution form.

  • Restart your AR Server and re-run the Data Management job.
  • Remove the execution entry of the Data Management job from the UDM:ExecutionInstance form and re-run the job.

The job fails with the following error messages:

  • Error Connecting to ARSystem
  • Did not find Remedy Application Service password for server xxxxxxx in UDM:RAppPassword Form

Make sure that the UDM:Config form and the UDM:RAppPassword form contain the correct server entries.

The following error message is displayed:

ERROR (90): Cannot establish a network connection to the AR System server; servername:31500

Make sure that the UDM:Config form and the UDM:RAppPassword form contain the correct server entries.


The following error messages are displayed:

  • Error while fetching data from form UDM:ExecutionStatus
  • ERROR (623): Authentication failed; aradmin

  • Make sure that the UDM:Config form is configured correctly.
  • Make sure that the UDM:RAppPassword form is configured and the passwords are correct.

The following error messages are displayed:

  • ARDBCPluginRepository.java:445 > getListEntryWithFields() FAILs in plugin: ARSYS.ARDBC.PENTAHO
  • ERROR (623): Authentication failed; aradmin

  • Make sure that the UDM:Config form is configured correctly.
  • Make sure that the UDM:RAppPassword form is configured and the passwords are correct.

The following error messages are displayed:

  • ERROR [pool-4-thread-25] com.bmc.arsys.pluginsvr.plugins.a (?:?) - createEntry() FAILs in plugin: ARSYS.ARDBC.PENTAHO
  • ERROR (8753): Error in plugin; servername.name.com

Verify that the load balance server is listed in the UDM:RAppPassword form and that the Windows hosts file contains the following entries:

  • AR Server IP servergroupname
  • AR Server IP servergroupname.domain.net

The following error messages are displayed:

  • Error in plugin : servername.xyz.com (ARERR 8753)
  • An application command failed. (ARERR 4554)
  • Application-Delete-Entry "DMT:Action" 000000000060008

Verify that the load balance server is listed in the UDM:RAppPassword form and that the Windows hosts file contains the following entries:

  • AR Server IP servergroupname
  • AR Server IP servergroupname.domain.net

The following error messages are displayed:

  • Error in plugin: No Carte Server with name servername exists in UDM:Config form. (ARERR 8753)
  • 390626: An application command failed. (ARERR 4554)
  • Application-Delete-Entry "DMT:Action" 000000000002304

Verify the server entries in the UDM:Config form.

The data is not captured in the arcarte and arcarte-stdout log files.

In a server group environment, verify that the logs are written on the correct servers outlined in the UDM:Config form.

If you see that the logs are written to the secondary server and not the default primary server, the primary server has issues.

Make sure that the primary Atrium Integrator server is up and running. Review the arjavaplugin.log, arerror.log, and the arcarte log files for plugin errors.

The following error is displayed in the Load Error log from the DMT Console and also in the arjavaplugin log on the non-admin server:

Error Connecting to ARSystem 
Error while impersonating [my user name] Cannot establish a connection to the ARSystem server [admin server alias]:[TCP]

To resolve this error, see Knowledge Article number 000201959 (Support logon ID required).
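
When you review large log files, it can help to scan for the error signatures listed above and print the matching action. The following sketch covers only a few of the symptoms. The `KNOWN_ISSUES` table and the `suggest_actions` function are hypothetical, the action text is a summary of the table rather than an exhaustive diagnosis, and the sketch assumes that plain substring matching is good enough for your logs.

```python
from typing import List, Tuple

# (substring to look for in the log, summarized action from the table above)
KNOWN_ISSUES: List[Tuple[str, str]] = [
    ("Did not find Remedy Application Service password",
     "Check the server entries in the UDM:Config and UDM:RAppPassword forms."),
    ("ERROR (90)",
     "Check the server entries in the UDM:Config and UDM:RAppPassword forms."),
    ("ERROR (623)",
     "Check the UDM:Config form and the passwords in the UDM:RAppPassword form."),
    ("No Carte Server with name",
     "Verify the server entries in the UDM:Config form."),
]


def suggest_actions(log_text: str) -> List[str]:
    """Return the distinct actions whose error signature appears in the log text."""
    actions: List[str] = []
    for signature, action in KNOWN_ISSUES:
        if signature in log_text and action not in actions:
            actions.append(action)
    return actions
```

For example, passing the contents of arjavaplugin.log to `suggest_actions` returns the actions to try first; an empty list means that none of the listed signatures were found.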

Knowledge articles

The following table lists knowledge articles that help in troubleshooting issues related to UDM and DMT:

Title

Knowledge article

Troubleshooting DMT UDM Load Balance and Server Group issues

Troubleshooting DMT UDM Promote issues

Troubleshooting DMT or UDM validate issues during a job load

 
