This documentation supports the 23.3 version of BMC Helix ITSM: Service Level Management.

Troubleshooting Business Rules Engine issues in server groups

Important

Starting with version 9.1.04, BMC does not support the SLM Collector or any BMC-provided integrations that use the SLM Collector.

You might encounter the following issues with the Business Rules Engine in server groups.

Issue symptoms

You might get ARERR 92 and ARERR 94 (database time-out) errors, ARERR 93 (busy server) errors, and ARAPPERR 4502 and ARAPPNOTE 4501 errors. One of the most important symptoms is evidence in the armonitor.log file that the BRIE server terminated (crashed) and restarted.

The following are sample error messages that you might get:

  • ddd mmm xx hh:mm:ss yyyy BRIE : Operation cancelled due to error (ARAPPERR 4502)
  • ddd mmm xx hh:mm:ss BRIE : Operation cancelled due to error (ARAPPERR 4502) ddd mmm xx hh:mm:ss - -
  • ddd mmm xx hh:mm:ss BRIE : Timeout during database update – the operation has been accepted by the server and will usually complete successfully (<server>) ARERR - 92 ddd mmm xx hh:mm:ss BRIE : AR System Application server terminated – fatal error encountered (ARAPPNOTE 4501)
  • ddd mmm xx hh:mm:ss BRIE : Timeout during database query – consider using more specific search criteria to narrow the results, and retry the operation (ARERR 94) ddd mmm xx hh:mm:ss BRIE : <server>
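
To see which of these codes dominate in a log, the sample messages above can be tallied with a short script. The following is a minimal sketch, not a BMC-provided tool: the function name `count_brie_codes` is hypothetical, and the sketch assumes that BRIE entries contain the literal text `BRIE :` and that codes appear in the forms shown in the samples, such as `(ARAPPERR 4502)` or `ARERR - 92`.

```python
import re
from collections import Counter

# Matches codes written as "(ARERR 94)", "ARERR - 92", or "(ARAPPERR 4502)".
# Assumption: these are the only layouts used, based on the sample messages.
CODE_PATTERN = re.compile(r"\b(ARERR|ARAPPERR|ARAPPNOTE)\s*-?\s*(\d+)\b")


def count_brie_codes(lines):
    """Count error codes on log lines that mention the BRIE server."""
    counts = Counter()
    for line in lines:
        # Skip entries written by other clients of the AR System server.
        if "BRIE :" not in line:
            continue
        for kind, number in CODE_PATTERN.findall(line):
            counts[f"{kind} {number}"] += 1
    return counts
```

For example, passing the lines of an arerror.log file opened with `open(path)` returns a `Counter` keyed by strings such as `ARERR 92`.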

Issue scope

Business Rules Engine (BRIE) errors are reported to the arerror.log file during the build of service targets, at startup, and when there is no BRIE server activity other than normal polling action. The BRIE server is a client connection to the AR System server, and any systemic issues that cause ARERR 92, 93, and 94 errors for users accessing the system have a similar impact on the BRIE server connection.

Entries in the arerror.log file that include ARAPPERR 4502 errors indicate the failure of a service target build. However, not all service target build failures result in ARAPPERR 4502 errors in the arerror.log file.

Resolution

If the BRIE server errors in the arerror.log file are reported during the same time frame in which users report the same errors, treat them as a symptom of underlying system issues (for example, poor database performance or inadequate resources) rather than as the source of the problem. In that case, troubleshoot by reviewing the logs and system performance data relevant to these errors.
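
The time-frame comparison described above can be sketched as a small check. This is an illustration only: the function names and the five-minute window are hypothetical, and the sketch assumes that timestamps have already been extracted from the logs in the full `ddd mmm xx hh:mm:ss yyyy` layout (for example, `Mon Jan 01 10:00:00 2024`).

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, for example "Mon Jan 01 10:00:00 2024".
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_stamp(text):
    """Convert one log timestamp string to a datetime."""
    return datetime.strptime(text, STAMP_FORMAT)


def errors_coincide(brie_stamps, user_stamps, window=timedelta(minutes=5)):
    """Return True if any BRIE error falls within `window` of a user error."""
    brie = [parse_stamp(s) for s in brie_stamps]
    users = [parse_stamp(s) for s in user_stamps]
    return any(abs(b - u) <= window for b in brie for u in users)
```

A result of `True` suggests that the BRIE errors and the user errors share an underlying cause. It does not identify that cause.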

BRIE server issues

The following section covers some of the systemic problems that can impact the BRIE server when the server group configuration is either incomplete or incorrect.

Best practice

We recommend that you review the current configuration of a server group implementation to ensure that it is a robust and viable server group environment. Once you have verified the server group implementation, you can re-evaluate the BRIE errors.

To re-evaluate BRIE errors

  1. Restart all servers in the server group with Server-Group-Member: F.
    Restart the servers after you clear the Server Group Member option on the Configuration tab of the Application Administration Console or manually set Server-Group-Member: F in the ar.cfg[conf] file. The next time you restart the AR System server, it starts without being a member of a server group.

    Best practice

    We recommend that you run one AR System server at a time during maintenance to prevent duplication of server group operations on the same database (for example, escalations). Although the system runs with a single AR System server for the duration of the maintenance period, this approach keeps your production system online when it is not possible to shut down production for a maintenance window.
    For a robust server group configuration, the following entries are in the ar.cfg[conf] file for each server in the server group:

    • Server-Name entry in common for all servers in server group
    • Server-Connect-Name as hostname (FQDN) of the server box where this member of server group is running
    • Separate IP-Name entry for each of the following fields, for every member in the server group:
      • hostname
      • FQDN
      • IP address
        For a minimum server group with two AR System servers, there are six IP-Name entries in each ar.cfg[conf] file:
        IP-Name: hostnameServer1
        IP-Name: FQDNServer1
        IP-Name: IPAddressServer1
        IP-Name: hostnameServer2
        IP-Name: FQDNServer2
        IP-Name: IPAddressServer2
  2. Enter the Domain-Name.
    By default, the value for Server-Name is the host name of the machine where the AR System server is running. In a server group, this default is no longer suitable because the Server-Name entry must be common to all servers in the group. Typically, the Server-Name entry is the name of the load balancer when a load balancer is part of the environment.

    Important

    You do not need to have a load balancer in order to configure server groups.

    The value for the Server-Name entry is the alias for the AR System server group and can be any value that is appropriate. The Server-Name alias must be resolvable in the environment where each AR System server is running. You can verify this by using the ping utility.
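
As a scripted complement to the ping check, name resolution can also be tested from each AR System server host. The following is a minimal sketch under stated assumptions: the function name `unresolvable_names` and the `resolver` parameter are hypothetical, and the check covers name resolution only, not network reachability, so it does not replace ping.

```python
import socket


def unresolvable_names(names, resolver=socket.getaddrinfo):
    """Return the names that the local resolver cannot resolve.

    `resolver` defaults to socket.getaddrinfo and can be replaced for testing.
    """
    failed = []
    for name in names:
        try:
            resolver(name, None)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

Run it on every server in the group with the Server-Name alias and each Server-Connect-Name value. An empty list means that every name resolved on that host.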

To update the server group configuration

  1. Update the ar.cfg file for the primary AR System server to temporarily disengage server group functionality: Server-Group-Member: F
  2. Add IP-Name entries to the ar.cfg file for both AR System servers:
    • IP-Name: hostnameServer1
    • IP-Name: FQDNServer1
    • IP-Name: IPAddressServer1
    • IP-Name: hostnameServer2
    • IP-Name: FQDNServer2
    • IP-Name: IPAddressServer2
  3. Shut down both primary and secondary AR System servers.
  4. Restart the primary AR System server.
  5. Verify the value for the Server-Group-Name on the Advanced tab in the Application Administration Console (not required for AR System server 7.5 and later).
  6. Update the ar.cfg[conf] file for the primary AR System server to engage server group functionality: Server-Group-Member: T
  7. Restart all AR System servers in the server group.
  8. Verify the entries in the Server Group Operation Ranking form, specifically whether the expected values are listed for Server field options. The expected values are a single option for each server in the server group with values corresponding to the Server-Connect-Name for that AR System server.
  9. If the entries listed for the Server field options in the Server Group Operation Ranking form are as expected, maintenance on the server group configuration is complete. If the entries listed for the Server field are not as expected, perform the following tasks:
    1. Delete all entries in the form and re-enter them.
    2. In some cases, it might be necessary to perform maintenance at the database level and delete entries in the server group tables. Do this only under the supervision of technical support, because changes at the database level are not supported. If you need to update server group tables at the database level, provide output from the following queries to technical support:
      select * from servgrp_applic
      select * from servgrp_board
      select * from servgrp_config
      select * from servgrp_ftslic
      select * from servgrp_op_mstr
      select * from servgrp_userlic*

      Best practice

      We recommend that you make changes at the database level only after you have a viable database backup and only under the specific direction of technical support, especially in critical production systems.

    3. If you made any changes to the Server Group Operation Ranking form, you must restart all AR System servers in the server group. When the server group is properly configured, the ar.cfg[conf] file for each server in the server group is updated with entries for the server group operations where the value of F indicates the AR System server ranked as 1 for that operation and T indicates the AR System server is not ranked as 1 for that operation. For example:
      Approval-Server-Suspended: F (or T)
      Assignment-Engine-Suspended: F (or T)
      Business-Rules-Engine-Suspended: F (or T)
      CMDB-Service-Suspended: F (or T)
      Reconciliation-Engine-Suspended: F (or T)

      For a server group configuration where BMC SLM is a component of the environment, BRIE must have the same ranking as the Administration operation. To build service targets, you must log directly into the AR System server where the Administration operation is ranked as 1.
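
The IP-Name entries from step 2 and the suspended-operation entries described above can both be checked by reading the configuration file. The following is a minimal sketch, not a BMC utility: all function names are hypothetical, and it assumes that the file consists of `Option-Name: value` lines and that lines starting with `#` are comments.

```python
def parse_ar_cfg(text):
    """Map each option name to the list of values it has in the file."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with "#" carry no options.
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        options.setdefault(name.strip(), []).append(value.strip())
    return options


def missing_ip_names(options, expected):
    """Return the expected IP-Name values that are absent from the file."""
    present = set(options.get("IP-Name", []))
    return [name for name in expected if name not in present]


def rank_one_operations(options):
    """List operations whose *-Suspended entry is F (server ranked as 1)."""
    suffix = "-Suspended"
    return sorted(
        name[: -len(suffix)]
        for name, values in options.items()
        if name.endswith(suffix) and values[-1].upper() == "F"
    )
```

For a two-server group, pass the six values from step 2 as `expected`. An empty result from `missing_ip_names` means that all six entries are present in that server's file.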
