Troubleshooting DMT and UDM errors in a server group environment
Diagnosing and reporting an issue
Task | Action | Steps |
---|---|---|
1 | Verify that you have ranked the Atrium Integrator servers by using the AR System Server Group Operation Ranking form. | Before configuring the UDM:Config form in a server group environment, we recommend that you rank the Atrium Integrator servers by using the AR System Server Group Operation Ranking form. If you assign ranking 1 to a server, that server becomes the primary server and runs the jobs. If you assign ranking 2 to a server, that server becomes the secondary server. If the primary server fails, the secondary server (failover server) runs the jobs. If you do not assign ranking to the servers in a server group environment, the jobs run on the server that receives the request first. For more information, see Setting failover rankings for servers and operations. |
2 | Perform the UDM:Config form configuration checks. | |
3 | Perform the UDM:RAppPassword form checks. | |
4 | Perform the UDM:PermissionInfo form checks. | |
5 | Check the workarounds for common issues and resolve the problem. | |
6 | If none of the workarounds for common errors resolves the problem, gather the logs before contacting Support. | |
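
The ranking behavior described in task 1 can be sketched as a small selection function. This is only an illustration of the documented rule, not product code: the function name `pick_job_server`, the `rankings` dictionary, and the server names are all hypothetical.

```python
def pick_job_server(rankings, available, first_receiver):
    """Return the server expected to run a job, or None if no ranked server is up.

    rankings: dict of server name -> rank (1 = primary, 2 = secondary, ...)
    available: set of server names that are currently running
    first_receiver: the server that received the request first
    """
    if not rankings:
        # No ranking assigned: the job runs on the server that gets the request first.
        return first_receiver
    # Walk the servers from the lowest rank number upward and take the first one that is up.
    for server in sorted(rankings, key=rankings.get):
        if server in available:
            return server
    return None
```

For example, with `{"ai1": 1, "ai2": 2}` the function returns `ai1` while it is available and `ai2` when only `ai2` is available.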
UDM:Config form configuration checks
Ensure that you perform the following checks while troubleshooting the UDM:Config form configuration:
- The Atrium Integrator engine server name must be the same as the value of the Server-Connect Name parameter in the ar.cfg file.
- The host name for each entry must match the diserver host name defined in the armonitor.cfg file for the diserver and Carte plugin.
- No long names, aliases, or IP addresses are used as host names in the UDM:Config form.
- The value of the Is Default field is set to Yes for the primary server (ranked as 1 in the AR System Server Group Operation Ranking form).
- The Failover server name field must not have any entries.
- The port value must be 20000.
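
The checks above can be approximated in a short script once the form entries have been exported. The sketch below assumes that each UDM:Config entry is available as a dictionary with the hypothetical keys `host`, `engine_server`, `port`, `is_default`, and `failover_server`, and that `connect_names` maps each host to the connect name read from its ar.cfg file. It treats any host name containing a dot as a long name, which is an assumption. Aliases and the armonitor.cfg comparison are not covered here.

```python
import ipaddress

EXPECTED_PORT = 20000  # port value required by the checklist


def is_ip_address(name):
    """Return True if name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return False
    return True


def check_udm_config(entries, connect_names, primary_host):
    """Return a list of human-readable problems found in the exported entries."""
    problems = []
    for entry in entries:
        host = entry["host"]
        if entry["engine_server"] != connect_names.get(host):
            problems.append(f"{host}: engine server name differs from the ar.cfg connect name")
        if is_ip_address(host) or "." in host:
            problems.append(f"{host}: host must be a short name, not an IP address or long name")
        if entry.get("failover_server"):
            problems.append(f"{host}: Failover server name must be empty")
        if entry["port"] != EXPECTED_PORT:
            problems.append(f"{host}: port is {entry['port']}, expected {EXPECTED_PORT}")
        if host == primary_host and entry["is_default"] != "Yes":
            problems.append(f"{host}: Is Default must be Yes on the primary server")
    return problems
```

An empty list means that none of the scripted checks failed. It does not replace a manual review of the form.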
Example: In a Remedy AR server group environment with three servers, BMC recommends as a best practice that you enable the diserver or Carte plugin on all three servers. If the default server goes down, the secondary server (ranked as 2 in the AR System Server Group Operation Ranking form) runs the jobs, because the plugin is available on that server and runs the jobs that are created while server 1 is down. The job always runs on the primary server, irrespective of where it was triggered. Additionally, the user session on the AR System server is on, because the Is Default field is set to Yes on the primary server.
We recommend that you always restart the AR System service after you make changes to the UDM:Config form.
UDM:RAppPassword form checks
The UDM:RAppPassword form authenticates the Remedy Application Service password for $SERVER$ from Mid Tier and then finds the correct server name from the UDM:Config form.
The UDM:RAppPassword form must contain the entries for all the possible server names that can be used to connect to AR System server, including the following:
- IP addresses
- Alias names
- Load balancer names
After you make changes to the UDM:RAppPassword form, you do not need to restart AR System server.
Configure the UDM:RAppPassword form in the same way as the UDM:Config form. For example:
- newsc-s: AR Server alias name
- newscorp-vip: LB alias name
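
One way to confirm that every connect name is covered is to compare the names that clients use against the entries exported from the UDM:RAppPassword form. The sketch below assumes that both lists are available as plain strings. The function name `missing_rapp_entries` is hypothetical, and the case-insensitive comparison is an assumption.

```python
def missing_rapp_entries(connect_names, rapp_entries):
    """Return the connect names that have no matching UDM:RAppPassword entry."""
    # Normalize the existing entries so that case and stray spaces do not hide a match.
    have = {name.strip().lower() for name in rapp_entries}
    return sorted(n for n in connect_names if n.strip().lower() not in have)
```

For example, if clients connect through `newsc-s`, `newscorp-vip`, and an IP address, but the form lists only the two alias names, the function returns the IP address as the missing entry.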
UDM:PermissionInfo form checks
The UDM:PermissionInfo form is a regular form that contains a list of all the Pentaho transformations, jobs, database connections, slave servers, partition schemas, directory, cluster schemas, and the corresponding user group permissions for the 112 field.
While load balancing, if the Carte Server Name (optional) is set for a particular transformation or job, then the plugin always executes that transformation or job on that particular Carte server. This ensures that the load balancing of the data integration jobs is done across multiple Carte servers. If the Carte Server Name is not configured for a transformation or job, then that transformation or job is always executed on the local Carte server.
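
The routing rule in the preceding paragraph reduces to a simple fallback. The sketch below is illustrative only: `resolve_carte_server` and its parameters are hypothetical names, and the real decision is made inside the plugin.

```python
def resolve_carte_server(carte_server_name, local_carte_server):
    """Return the Carte server on which a transformation or job is expected to run."""
    # An empty or missing Carte Server Name means the local Carte server is used.
    name = (carte_server_name or "").strip()
    return name if name else local_carte_server
```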
If you have performed a database migration, the following forms contain the required data. Ensure that you update the server references in these forms:
- UDM:ExecutionInstance form
- UDM:PermissionInfo form
- UDM:Config form
- UDM:RAppPassword form
Ensure that you clean the data in the following forms:
- DMT:Thread Manager form
- CAI:Events form
- CAI:EventParams form
- UDM:Variable form
Gathering logs
Use the following log files to troubleshoot the UDM Load Balance configuration:
File name | Location |
---|---|
arjavaplugin.log | C:\Program Files\BMC Software\ARSystem\Arserver\Db directory |
arcarte.log | C:\Program Files\BMC Software\ARSystem\Arserver\Db directory |
arerror.log | C:\Program Files\BMC Software\ARSystem\Arserver\Db directory |
ar.cfg | C:\Program Files\BMC Software\ARSystem\Conf directory |
pluginsvr_config.xml | C:\Program Files\BMC Software\ARSystem\pluginsvr directory |
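
To hand the files in the table to Support in one piece, they can be bundled into a single archive. The following sketch uses the default Windows locations from the table. Adjust `AR_HOME` if the product is installed elsewhere. The function name `bundle_logs` and the archive name are hypothetical.

```python
import zipfile
from pathlib import Path

# Default installation directory taken from the table above; change it if needed.
AR_HOME = Path(r"C:\Program Files\BMC Software\ARSystem")

LOG_FILES = [
    AR_HOME / "Arserver" / "Db" / "arjavaplugin.log",
    AR_HOME / "Arserver" / "Db" / "arcarte.log",
    AR_HOME / "Arserver" / "Db" / "arerror.log",
    AR_HOME / "Conf" / "ar.cfg",
    AR_HOME / "pluginsvr" / "pluginsvr_config.xml",
]


def bundle_logs(files, archive_path):
    """Zip the files that exist and return (included, missing) lists of file names."""
    included, missing = [], []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for item in files:
            item = Path(item)
            if item.is_file():
                # Store each file under its bare name so the archive stays flat.
                archive.write(item, arcname=item.name)
                included.append(item.name)
            else:
                missing.append(item.name)
    return included, missing
```

Calling `bundle_logs(LOG_FILES, "udm_logs.zip")` on each server in the group produces one archive per server. The returned `missing` list shows which files were not found on that server.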
Resolutions for common issues
Consult the following table for workarounds to common issues that occur while running a DMT or UDM job:
Symptom | Action |
---|---|
The following error messages are displayed:
| Verify that the load balance server is listed in the UDM:RAppPassword form and the Windows Host file contains the following parameters:
|
A Data Management job is stuck in the In Progress state for a long time. The arjavaplugin logs display an error message about an invalid execution instance or an error relevant to creating an entry in the UDM:Execution form. |
|
The job fails with the following error messages:
| Ensure that the UDM:Config form and the UDM:RAppPassword form contain the correct server entries. |
The following error message is displayed: ERROR (90): Cannot establish a network connection to the AR System server; servername:31500 | Ensure that the UDM:Config form and the UDM:RAppPassword form contain the correct server entries. |
The following error messages are displayed:
|
|
The following error messages are displayed:
|
|
The following error messages are displayed:
| Verify that the load balance server is listed in the UDM:RAppPassword form, and the Windows Host file contains the following parameters:
|
The following error messages are displayed:
| Verify that the load balance server is listed in the UDM:RAppPassword form, and the Windows Host file contains the following parameters:
|
The following error messages are displayed:
| Verify the server entries in the UDM:Config form. |
The data is not captured in the arcarte and arcarte-stdout log files. | In a server group environment, verify that the logs are written on the correct servers listed in the UDM:Config form. If the logs are written to the secondary server and not to the default primary server, the primary server has issues. Ensure that the primary Atrium Integrator server is up and running. Review the arjavaplugin.log, arerror.log, and arcarte log files for plugin errors. |
The following error is displayed in the Load Error log from the DMT Console and also in the arjavaplugin log on the non admin server: | To resolve this error, see Knowledge Article number 000201959 (Support logon ID required). |
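
For the ERROR (90) symptom in the table, a basic TCP reachability test can help separate a network problem from a wrong server entry. The sketch below is a generic socket check, not a BMC utility. The function name `can_connect` is hypothetical, and the host and port must be replaced with the values shown in your own error message (port 31500 in the message quoted above).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host names that do not resolve.
        return False
```

A result of `True` shows only that the port accepts connections. It does not confirm that the server entries in the UDM:Config and UDM:RAppPassword forms are correct.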
Knowledge articles
The following table lists knowledge articles that help in troubleshooting issues related to UDM and DMT:
Title | Knowledge article |
---|---|
Troubleshooting DMT UDM Load Balance and Server Group issues | |
Troubleshooting DMT UDM Promote issues | |
Troubleshooting DMT or UDM validate issues during a job load |