
Troubleshooting service failover issues or server group issues with Email Engine


If there are service failover issues or server group issues with Remedy Email Engine, your Email Engine might not work correctly.

Use the information in this topic to troubleshoot service failover issues or server group issues with Email Engine. You can also use this information to troubleshoot issues with stand-alone Email Engine servers, version 9.x and later. Alternatively, create a BMC Support case.

Best Practice

We recommend that you refer to BEST FAQ on AR System Email Engine Issues for resolutions to the most common questions on Email Engine.


Issue symptoms

  • Incoming or outgoing mailboxes stop processing emails, which forces you to restart the Email Engine process.
  • The secondary Email Engine server doesn't start Email Engine operations when a service failover operation is triggered.
  • Email Engine doesn't process incoming or outgoing emails after you upgrade from an environment earlier than 9.x.

See Resolutions for common issues.

Issue scope

  • One or more Email Engines are affected in a server group.
  • Outgoing email notifications, incoming emails, and approvals by email are affected.

Diagnosing and reporting an issue

Task

Action

Steps

Reference

1.

Review the service failover configuration.

  1. Verify that the companion server name in the emaildaemon.properties file uniquely identifies the Email Engine host.
    This is often the Fully Qualified Domain Name (FQDN) of the host and is usually the same as the value in the Server-Connect-Name field for the AR System server. If Email Engine is deployed as a separate service, the companion server name might be different.
  2. Verify the server name in the emaildaemon.properties file.
    This can be either the load balancer name (the same as the Server-Name field) or the local host name (Server-Connect-Name), depending on whether you want to load balance the traffic from Email Engine to your AR System servers.
  3. Verify that the service failover operation is correctly ranked in the AR System Server Group Operation Ranking form. Consider ranking this operation on more than one server.
  4. Verify the rank in the AR System Service Failover Ranking form for each mailbox based on Service Name and Server Name.
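The server-name checks in this step can be partly automated. The following is a minimal sketch, assuming that emaildaemon.properties uses plain key=value lines; it lists every entry whose key mentions "server" so that you can compare the values against the Server-Name and Server-Connect-Name fields. The key names in the sample text are hypothetical, and the parser ignores advanced properties syntax such as line continuations.

```python
def load_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines from a Java-style .properties file."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def server_entries(props: dict[str, str]) -> dict[str, str]:
    """Return only the entries whose key mentions 'server' (case-insensitive)."""
    return {k: v for k, v in props.items() if "server" in k.lower()}


# Hypothetical sample content; the real key names in your file may differ.
SAMPLE = """
# sample only
example.companion.server=host1.example.com
example.server=lb.example.com
example.interval=5
"""

for key, value in server_entries(load_properties(SAMPLE)).items():
    print(f"{key} = {value}")
```

To check a real file, read its contents into a string and pass that string to load_properties instead of SAMPLE.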

2.

Check the existing logs for any important failover messages.

Check the email log for server group failover messages. The following example messages can indicate failover activity:

  • Received event for activating...
  • Activating service provider...
  • Successfully activated service provider...
  • Received event for suspending...
  • Suspending service provider...
  • Successfully suspended service provider...

Use the complete message from the email log to help determine the nature of the issue.
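Searching a large email log by hand is slow, so a small script can help. The following sketch filters log lines for the message fragments listed in this step. It assumes only that each message appears as plain text somewhere in a log line; it makes no assumption about timestamps or other fields in the email log format.

```python
# Message fragments taken from the list in this step.
FAILOVER_MARKERS = (
    "Received event for activating",
    "Activating service provider",
    "Successfully activated service provider",
    "Received event for suspending",
    "Suspending service provider",
    "Successfully suspended service provider",
)


def find_failover_lines(lines):
    """Yield (line_number, line) for each line that contains a failover marker."""
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in FAILOVER_MARKERS):
            yield number, line.rstrip("\n")
```

Because the function accepts any iterable of strings, you can pass it an open file object for email.log and print each returned pair to see the complete messages in order.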

3.

Monitor the failover status.

  1. Check the Last Heartbeat and Status fields for each provider that is listed in the AR System Service Failover Whiteboard form to determine which mailboxes are not updating properly.
    Make sure that:
    1. The Last Heartbeat is updated every 30 seconds for each *emaildaemon://* Service Name record in the AR System Service Failover Whiteboard form.
    2. The Status for each mailbox that is listed in the AR System Service Failover Whiteboard form is one of the following:

      • Active: The service provider is processing requests. The mailbox is processing emails without any issue.
      • Suspending: The service provider is instructed to complete or abort any in-progress request and give up ownership of the service. The mailbox is about to pass the operation to another server in the server group.
      • Waiting: The service provider is available to process requests. If it starts to process emails again, its status changes from Waiting to Active.
      • Unavailable: The heartbeat from the service provider is no longer detected. The heartbeat is an API call that repeatedly checks whether the Email Engine server is up and running. This status usually means that the mailbox configuration is wrong, there is an issue with the connection, or the Email Engine service is down.

      Important

      If the Status is Active for the mailboxes with rank 1 and the Last Heartbeat is being updated, the AR System Email Engine mailboxes with rank 1 are connecting correctly to the AR System server and should be processing emails properly. If the Status is not Active or the Last Heartbeat is not being updated for one or more mailboxes with rank 1, this usually means that Email Engine is not starting or is not configured correctly.

  2. Verify that the Owning Server field for the *emaildaemon://* operations in the AR System Server Group Current Ranking form matches the server that is ranked 1 for each mailbox in the AR System Service Failover Ranking form.
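The heartbeat and status rules in this step can be expressed as a simple check. The following sketch assumes that you have copied the relevant values from the forms into a list of dictionaries; the field names (service, rank, status, last_heartbeat) are hypothetical and are not the form's internal field names. The tolerance of two missed 30-second intervals is also an assumption.

```python
from datetime import datetime, timedelta

HEARTBEAT_INTERVAL = timedelta(seconds=30)  # interval stated in this step
TOLERANCE = 2  # assumption: allow two missed intervals before flagging


def flag_rank_one_problems(records, now):
    """Return rank 1 records that are not Active or whose heartbeat is stale."""
    limit = HEARTBEAT_INTERVAL * TOLERANCE
    flagged = []
    for rec in records:
        if rec["rank"] != 1:
            continue  # lower-ranked providers are not expected to be Active
        stale = now - rec["last_heartbeat"] > limit
        if rec["status"] != "Active" or stale:
            flagged.append(rec)
    return flagged


# Hypothetical sample data for illustration only.
now = datetime(2024, 1, 1, 12, 0, 0)
records = [
    {"service": "emaildaemon://mailbox-a", "rank": 1, "status": "Active",
     "last_heartbeat": now - timedelta(seconds=20)},
    {"service": "emaildaemon://mailbox-b", "rank": 1, "status": "Active",
     "last_heartbeat": now - timedelta(minutes=5)},
    {"service": "emaildaemon://mailbox-a", "rank": 2, "status": "Waiting",
     "last_heartbeat": now - timedelta(seconds=10)},
]
for rec in flag_rank_one_problems(records, now):
    print(rec["service"], rec["status"])
```

In this sample, only mailbox-b is flagged, because its heartbeat is five minutes old even though its status is still Active.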

4.

Enable the logging.

Enable the Email Engine logs in the FINEST mode and also the API Recording log.

ARAPILOGGING shows information about all the API calls that Email Engine makes to the AR System servers in the server group to validate whether the service is up and running. Use it to verify that the service is sending the heartbeat.

5.

Reproduce the issue.

After the logging is enabled, restart the Email Engine with rank 1 and keep the logging on for 10 to 20 minutes or until you observe an issue.


6.

Disable the logging.

To disable the logging, see the knowledge base articles that you used to enable logging:

Important

The email.log file grows very quickly when the FINEST mode is enabled. Therefore, keep the logging enabled only until the issue is reproduced.

  • Set the Email Engine logging back to SEVERE, which is the default value.
  • Set the API Recording level to 0 in the arsys_api.xml file or rename the file, and then restart the Email Engine process.

7.

Analyze the logs.

Review the logs yourself to try to identify any error messages or behaviors.

8.

Collect the logs and configuration files.

Copy the logs to another location where you can review them so that they do not get overwritten.

  • Log locations
    • (UNIX) /opt/bmc/ARSystem/AREmail/Logs
    • (Windows) C:\Program Files\BMC Software\ARSystem\AREmail\Logs
  • Configuration files
    • emaildaemon.properties
    • ar.cfg/ar.conf
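Copying the files into a timestamped folder keeps each test run separate and protects the logs from being overwritten. The following is a minimal sketch of that idea. The function name, the folder prefix email-engine-, and the destination layout are assumptions; the caller supplies the log directory and the paths to the configuration files for its own installation.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(sources, dest_root, stamp=None):
    """Copy each existing file or directory in sources into a new timestamped folder."""
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(dest_root) / f"email-engine-{stamp}"
    # Fail rather than overwrite if the folder already exists.
    target.mkdir(parents=True, exist_ok=False)
    copied = []
    for src in map(Path, sources):
        if src.is_dir():
            shutil.copytree(src, target / src.name)
        elif src.is_file():
            shutil.copy2(src, target / src.name)  # copy2 keeps file timestamps
        else:
            continue  # skip paths that do not exist on this host
        copied.append(src.name)
    return target, copied
```

For example, on UNIX you might pass /opt/bmc/ARSystem/AREmail/Logs together with the full paths to emaildaemon.properties and ar.conf as sources, and a directory outside the Email Engine installation as dest_root.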

9.

Create a BMC Support case

  • Collect logs and detailed information, and send them to BMC Support when you create a case.
  • Provide the following information as part of your case:
    • Time when the testing was performed
    • Error messages received, if any
    • Service failover and ranking data from step 1 and step 2, or screenshots showing the records for each of the forms.
  • Run the Log Zipper utility.
  • Include the email.log file and the RemedyApplicationService<servername><port>_arapires.log file collected during the test in the ZIP file.
  • Attach the ZIP file to your case. The ZIP file can be up to 2 GB in size. You can also upload the file through FTP.

Resolutions for common issues

After you determine a specific symptom or an error message, use the following list to identify the solution:

  • Symptom: You notice incorrect service failover ranking on the AR System Service Failover Ranking form or on the AR System Service Failover Whiteboard form.
    Action: Verify that the operations and components are ranked correctly.
  • Symptom: You notice that the service provider state is Unavailable on the AR System Service Failover Whiteboard form.
    Action: Determine why the status is Unavailable.
  • Symptom: Incoming or outgoing email messages stop processing.
    Action: It's possible that a thread is getting blocked from the coordinator server or on the coordinator server of the current server group. Restart the current coordinator server.

 
