TCP agent connection troubleshooting


Use the following instructions to troubleshoot a TCP agent connection.

To troubleshoot a TCP agent connection

  1. Confirm the version of the agent.
    • (UNIX) -- Run the pw version command.
    • (Microsoft Windows) -- Go to Control Panel > Add/Remove Programs and scroll down to the agent instance to check its version number.
    The agent version must be the same as, or earlier than, the server version.
  2. Confirm that the target IP address on the agent status monitor is correct.
  3. If the monitor measurement is in question, confirm the source IP address of the monitor.
  4. To confirm that the server can reach the agent on its listening port (12124 by default), run the following command from the server: telnet agentIP 12124
  5. If you get a connection established message followed by garbage characters, the agent is accessible, listening, and responding.
  6. If the connection times out, the agent is either not running or is inaccessible for network reasons.
  7. Ensure that the firewall meets the following requirements:
    The TCP agent requires that the server be able to initiate a connection to the agent on port 12124 (or whichever port was chosen during installation).
    If the firewall employs Network Address Translation (NAT), it must use static one-to-one address translation. If dynamic NAT is used, the agent cannot connect a second time. If many-to-one translation (IP masquerading) is used and the agent is in the private address space, the server cannot initiate the connection. Many-to-one NAT supports only IP connections initiated from the private address space to the public (legal) address space, not the other way around.
  8. Ensure that the agent is running on the agent computer by checking the agent service status:
    (UNIX): Run the following command to check the agent service status: pw agent status
    (Microsoft Windows): Check the agent service status in Control Panel > Services.
  9. Use the netstat -an command to confirm that the agent is listening. This command works for agents on UNIX and on Windows.
  10. In the output, look for a LISTENING entry with port 12124 in the left (local address) column. For example: TCP 10.10.19.201:12124 0.0.0.0:0 LISTENING
  11. If a connection is established, the right column shows the connecting IP address. Ensure that this is the IP address of the BMC TrueSight Infrastructure Management Server. For example, TCP 10.10.19.201:12124 10.10.89.216:61144 ESTABLISHED. If the IP address in the right column does not belong to your server, another host has taken the connection.
  12. You can also use the netstat command to check the connection on the server side. The server must show a connection with an entry similar to the example in the preceding step (only with the left and right columns reversed).

    If the server does not display a corresponding ESTABLISHED entry, or if it has a corresponding TIME_WAIT entry, the agent is probably in an inactive state, and the server has severed the connection due to lack of response.

  13. If one or more TIME_WAIT entries appear on the agent side (CLOSE_WAIT entries on the server side), the agent closed the connection because it was restarted by a human operator or by the internal protection mechanisms of the agent. For example, TCP 10.10.19.201:12124 10.10.89.216:61144 TIME_WAIT

    If one or more CLOSE_WAIT entries appear on the agent side (TIME_WAIT entries on the server side), it indicates that the network connection is being interrupted by network problems. The server also severs the connection if it is restarted or if the agent fails to respond to the heartbeat signals in a timely fashion. It can also indicate that the agent process is not responding due to a system resource conflict, or a malfunction in one of its monitors. For example, TCP 10.10.19.201:12124 10.10.89.216:61144 CLOSE_WAIT

    If the agent is listening on port 12124, and a telnet connection can be established on port 12124 from the server, but the server still considers the agent unreachable, the agent is most likely not responding.
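
The telnet check in steps 4 through 6 can also be scripted. The following Python sketch is an illustration only and is not part of the product: the function name check_agent_port and its return strings are hypothetical, and 12124 is the default port from the steps above. It attempts a plain TCP connection and reports whether the port was reachable, refused, or timed out.

```python
import socket

# Default agent port from the procedure; change it if another port
# was chosen during installation.
DEFAULT_AGENT_PORT = 12124


def check_agent_port(host, port=DEFAULT_AGENT_PORT, timeout=5.0):
    """Attempt a TCP connection, much as `telnet host port` would.

    Returns 'reachable', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        # create_connection opens the socket; the with-block closes it again.
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection on this port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and similar errors.
        return f"error: {exc}"
```

To use it, call check_agent_port with the agent address in place of agentIP, from the server computer. A result of reachable corresponds to step 5 and timeout to step 6. A result of refused generally means that the host answered but nothing accepted the connection on that port, which usually points to the agent not listening rather than to a network problem.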

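The netstat checks in steps 9 through 13 on the agent side can be sketched in the same way. This Python example is a hedged illustration: the names parse_netstat and diagnose are hypothetical, it assumes plain netstat -an output in which the last three fields of each TCP line are the local address, the remote address, and the state, and the messages only paraphrase the interpretations given in the steps above.

```python
def parse_netstat(output, port=12124):
    """Return (local, remote, state) for TCP lines whose local port matches.

    Assumes `netstat -an` output: no extra trailing columns such as a PID.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue  # skip headers, UDP lines, and blank lines
        local, remote, state = fields[-3], fields[-2], fields[-1]
        # Some UNIX systems print address.port instead of address:port.
        if local.endswith((f":{port}", f".{port}")):
            rows.append((local, remote, state))
    return rows


def diagnose(rows, server_ip):
    """Map agent-side connection states to the findings described in the steps."""
    findings = []
    states = {state for _, _, state in rows}
    # Windows prints LISTENING; many UNIX systems print LISTEN.
    if not states & {"LISTEN", "LISTENING"}:
        findings.append("agent is not listening on the port")
    for _, remote, state in rows:
        remote_ip = remote.rsplit(":", 1)[0]
        if state == "ESTABLISHED" and remote_ip != server_ip:
            findings.append(f"connection held by unexpected host {remote_ip}")
        elif state == "TIME_WAIT":
            findings.append("agent closed the connection (for example, a restart)")
        elif state == "CLOSE_WAIT":
            findings.append("peer closed the connection (network problem, "
                            "server restart, or unresponsive agent)")
    return findings


# Example lines taken from the steps above.
SAMPLE_OUTPUT = """\
TCP 10.10.19.201:12124 0.0.0.0:0 LISTENING
TCP 10.10.19.201:12124 10.10.89.216:61144 ESTABLISHED
"""
```

Calling diagnose(parse_netstat(SAMPLE_OUTPUT), "10.10.89.216") returns an empty list, because the agent is listening and the only established connection comes from the expected server address. On the server side the columns are reversed, so this sketch would need to match the port in the remote column instead.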
 

 
