Troubleshooting BMC PATROL Agent-to-Agent communications
Before you troubleshoot BMC PATROL Agent-to-Agent communications, the BMC PATROL Agent and the UNIX KM must be fully installed on the remote node. You must also have the PATROL account that you use to start the BMC PATROL Agent, and you must know the port number that the BMC PATROL Agent uses.
In this topic, local host refers to the node where the database server and BMC PATROL for Oracle e-Business Suite run, and remote host refers to the middle-tier server host where middle-tier servers such as forms servers and web servers run.
This topic contains the following information:
- Adding a middle-tier server host
- Assuring that the BMC PATROL Agent is up and running on the remote host
- Mechanism of BMC PATROL Agent-to-Agent communications
- If the PATROL password has changed on the remote node
- When AgentStatus of a remote host is suddenly in ALARM
- Performing OA configuration
- Performing configuration of a middle-tier server (general)
- Configuring a forms server for monitoring
- Configuring an Apache server for monitoring
- AgentStatus alarm, web and/or forms status OK
- If you cannot locate the source of the problem
Adding a middle-tier server host
You must add a remote host for monitoring before you can add or configure a middle-tier server (forms server, web server, and so on) for monitoring.
- On the $ icon, access the OA Configuration menu command.
- Select Add Middle-Tier Server Host. Complete the following fields:
  - Host name: the name or IP address of the remote host that your local host should communicate with.
  - Port number: the port number assigned to the BMC PATROL Agent running on the remote host.
  - PATROL account and password: the OS account on the remote host that will be used for agent-to-agent communications.
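Before you enter the host name and port number, it can help to confirm that the port is reachable from the local host. The following Python sketch is one way to do that. It assumes that the remote BMC PATROL Agent accepts TCP connections on its configured port; the function name and the sample values in the usage note are hypothetical and are not part of the product.

```python
import socket


def agent_port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Assumption: the remote BMC PATROL Agent accepts TCP connections on its
    configured port. A True result only shows that something is listening
    there; it does not prove that the listener is a PATROL Agent.
    """
    try:
        # create_connection resolves the host name and honors the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, agent_port_reachable("midtier01.example.com", 3181) checks a placeholder host and port. A False result points to a network, firewall, or Agent startup problem rather than to the account or password.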
Assuring that the BMC PATROL Agent is up and running on the remote host
When BMC PATROL for Oracle e-Business Suite is up and running, it creates an instance of the remote host under OA_OS. If the KM can communicate with the Agent on the remote host, the remote host instance is in OK status (icon highlighted) and its AgentStatus parameter is in OK status. Otherwise, AgentStatus stays in ALARM state.
You can verify the Agent status by issuing the following command in the System Output Window (SOW) of the local host:
%PSL print(get("/OA_OS/<host instance>/platform"));
You can get <host instance> by typing:
%PSL print(get("/OA_OS/instances"));
The value of the platform variable is the machine type of the remote host. If the value is blank, either BMC PATROL for Oracle e-Business Suite has not yet communicated with the remote host, or the communications are still in progress. When BMC PATROL for Oracle e-Business Suite exhausts its retry count without success, it sets AgentStatus to ALARM state.
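The interpretation of the platform variable can be summarized as a small decision function. This Python sketch only restates the paragraph above; the function name, its parameters, and the returned labels are hypothetical and do not correspond to anything inside the KM.

```python
def interpret_platform(platform_value: str, retries_exhausted: bool) -> str:
    """Classify the result of printing /OA_OS/<host instance>/platform.

    platform_value    -- the text printed in the System Output Window
    retries_exhausted -- True if the KM has used up its retry count
    """
    if platform_value.strip():
        # A non-blank value is the machine type of the remote host,
        # so the KM has communicated with the remote Agent.
        return "communicated"
    if retries_exhausted:
        # Blank value and no retries left: AgentStatus goes to ALARM.
        return "alarm"
    # Blank value with retries remaining: not yet contacted or in progress.
    return "pending"
```

A "pending" result means it is worth waiting and checking again before investigating further.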
Mechanism of BMC PATROL Agent-to-Agent communications
The communication messages are contained in a text file, and the file is transferred from the remote host to the local host. Most of the time, the messages are small. However, certain messages, such as PS_SHARE and process data, can be large, depending on the size of the OA system. The file is not expected to exceed 1 MB.
To assure that agent-to-agent communications go smoothly, confirm the following:
- The /AgentSetup/defaultAccount has valid information for the PATROL account on the local host.
- Disk space is available on $PATROL_HOME on the remote host.
- The PATROL account has write privilege on the $PATROL_HOME/remote directory on the remote host.
- Disk space is available on $PATROL_HOME/remote ($PATROL_REMOTE if set) on the local host.
- The PATROL account has write privilege on the $PATROL_HOME/remote directory ($PATROL_REMOTE if set) on the local host.
- CPU load and resource availability on the remote host are acceptable. When the CPU load on the remote host is too heavy and too many PSL executions are requested at the same time, the Agent takes a very long time to finish them, which causes an Exhaust retry count error on the local host.
If ACLs (Access Control Lists) are in use, confirm that you have permitted PEM communication between the hosts in question. See Access Control List (ACL) Permissions.
If memory is in short supply on the remote host, the BMC PATROL Agent might trigger the OS error "cannot fork: too many processes" during PSL execution. This error also results in an Exhaust retry count error on the local host.
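The disk-space and write-privilege checks listed above can be sketched in Python as follows. This is a minimal sketch, not a BMC utility: the function names are hypothetical, and the 1 MB default threshold is an assumption taken from the expected maximum size of the transfer file. Run it as the PATROL account on the host being checked, because the write test reflects the privileges of the current user.

```python
import os
import shutil


def resolve_remote_dir(env: dict) -> str:
    """Return $PATROL_REMOTE if it is set, otherwise $PATROL_HOME/remote."""
    remote = env.get("PATROL_REMOTE")
    if remote:
        return remote
    return os.path.join(env["PATROL_HOME"], "remote")


def check_remote_dir(path: str, min_free_bytes: int = 1024 * 1024) -> dict:
    """Report whether path exists, is writable, and has enough free space.

    min_free_bytes is an assumed threshold (1 MB by default).
    """
    exists = os.path.isdir(path)
    # Creating files in a directory needs both write and search permission.
    writable = exists and os.access(path, os.W_OK | os.X_OK)
    free = shutil.disk_usage(path).free if exists else 0
    return {
        "path": path,
        "exists": exists,
        "writable": writable,
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
    }
```

For example, check_remote_dir(resolve_remote_dir(dict(os.environ))) checks the directory for the current environment. It raises KeyError if neither PATROL_REMOTE nor PATROL_HOME is set.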
If the PATROL password has changed on the remote node
On the remote host:
- Stop the BMC PATROL Agent.
- Log in to a new terminal session with the PATROL account and start the BMC PATROL Agent again.
On the local host:
- Do not stop the PATROL Agent or break the current connection.
- Delete the middle-tier server host. From the OA Configuration menu command, select Delete Middle-tier Server Host.
- Add the middle-tier server host again. From the OA Configuration menu command, select Add Middle-tier Server Host. You have to supply the new password.
When AgentStatus of a remote host is suddenly in ALARM
A sudden ALARM state means that agent-to-agent communications have been broken.
- Check the annotation of OSPerformance of OA_COLLECTOR. The annotation text tells you what happened.
- Check whether the BMC PATROL Agent on the remote host is still running.
- Check whether the BMC PATROL account on the remote host is still valid, and whether its password was changed.
- Check whether the BMC PATROL Agent is stuck. This can occur when the file system is down or heavily loaded, or when the network connection has timed out.
File system and network issues are not critical errors. When the file system or network returns to normal, communications resume automatically.
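To check whether the Agent is still running on the remote host, you can scan the process list there. The Python sketch below filters the text output of ps -ef. It assumes that the Agent process appears under the name PatrolAgent; the function name is hypothetical.

```python
def find_patrol_agents(ps_output: str) -> list:
    """Return the process lines whose text mentions PatrolAgent.

    ps_output is the raw text printed by `ps -ef`. The first line is
    treated as the column header and skipped.
    Assumption: the Agent process is named PatrolAgent.
    """
    lines = ps_output.splitlines()
    return [line for line in lines[1:] if "PatrolAgent" in line]
```

One way to obtain the input is subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout, run on the remote host. An empty result suggests that the Agent is down; a matching line also shows which account owns the process.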
Performing OA configuration
The OA Configuration menu screen lets you configure an OA instance as well as its middle-tier servers. If you are an inexperienced user, configure the OA instance on its own first. Perform the following steps:
- From OA Configuration, select Add OA Server Configuration.
- Fill in the fields on the screen, but do not select the buttons for configuring middle-tier servers. Finish configuring the OA instance.
- Add all the remote hosts where the middle-tier servers run, and where you will configure the middle-tier servers for monitoring.
- Exit the OA Configuration menu screen. Wait until the OA instance is discovered.
- On the OA instance (money bag) icon, select the Configure Middle-tier server menu command. You can add the middle-tier servers from this menu command.
Performing configuration of a middle-tier server (general)
A middle-tier server is a web server, a forms server, a metrics client or a metrics server running on either a local host or a remote host.
You must add the remote host for monitoring before you can configure the middle-tier server for monitoring. Add the remote host once, and the added remote host will be shared by different types of middle-tier servers during configuration.
You do not need to add the local host, which is added by BMC PATROL for Oracle e-Business Suite automatically.
Configuring a forms server for monitoring
Ensure that the Agent on the remote host is up and running with the UNIX KM before you configure the Forms server. If the BMC PATROL Agent is down, you cannot perform any configuration.
On the Forms server configuration screen, there are two fields that can cause agent-to-agent communications to fail:
- PATROL Agents Run On
This field lists the remote hosts you have added for monitoring. Select the remote host with the port number that reflects the BMC PATROL Agent running on the remote host with that port.
- User name/Password
This account must be a valid OS account on the remote host. This account is not used for monitoring the Forms server. It must be able to run the ps -ef UNIX shell command, and it must have the privilege to restart the forms server as a recovery action.
If you do not intend to use this account to restart the forms server, it is suggested that the PATROL account be used.
If the ps -efl command is used in your environment, use the Define ps command style menu command of the OA_ALL_SYS application class to switch to the ps -efl command setting.
Configuring an Apache server for monitoring
Ensure that the Agent on the remote host is up and running with the UNIX KM before you configure the Apache server. If the BMC PATROL Agent is down, you cannot perform any configuration.
On the configuration screen of the Apache Server, there are two fields that might cause agent-to-agent communications to fail:
- PATROL Agents Run On
This field lists the remote hosts you have added for monitoring. Select the remote host with the port number that reflects the BMC PATROL Agent running on the remote host with that port.
- Web User name/Password
This account must be a valid OS account on the remote host, with read privilege on the <Common Top>/admin/scripts/adapcctl.sh file and on the Apache configuration file (httpd.conf).
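The read-privilege requirement can be verified on the remote host with a short Python sketch such as the one below. Run it as the web user account, because the test reflects the privileges of the current user. The function names are hypothetical, and both the Common Top directory and the location of httpd.conf are passed in as arguments because they depend on your installation.

```python
import os


def apache_files(common_top: str, httpd_conf: str) -> list:
    """Build the list of files that the web user account must be able to read.

    common_top -- the installation's <Common Top> directory (site specific)
    httpd_conf -- full path of the Apache configuration file (site specific)
    """
    return [
        os.path.join(common_top, "admin", "scripts", "adapcctl.sh"),
        httpd_conf,
    ]


def unreadable_files(paths: list) -> list:
    """Return the paths that are missing or not readable by the current user."""
    return [
        p for p in paths
        if not (os.path.isfile(p) and os.access(p, os.R_OK))
    ]
```

An empty list from unreadable_files(apache_files(common_top, httpd_conf)) means that both files are readable by the account that ran the check.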
AgentStatus alarm, web and/or forms status OK
It is possible for AgentStatus on the remote host to be in ALARM state while FormsServerStatus of the Forms server on the same computer stays OK. This scenario also applies to web servers.
This occurs when the OA_OS collector has run and detected the status of the BMC PATROL Agent on the remote host, while the Forms server collector has not yet run. If the failure is persistent (not transient), FormsServerStatus will eventually go into ALARM state. In this case, check the BMC PATROL Agent on the remote host.
If you cannot locate the source of the problem
Corruption of the PCONFIG database can prevent agent-to-agent communication. To test this possibility, start an Agent (on the database server only) on a port that has never been used, which results in a clean PCONFIG database. Test the configuration again.
If this resolves the problem, you can stop the Agent that uses the desired port number on the database server and delete the PCONFIG database from that installation. Doing so requires you to redo every configuration change made to any loaded KM, so attempt it only as a last resort.