Troubleshooting RSCD agents can be time-consuming because many different error messages, each with its own possible causes, can be involved. This page helps you resolve connectivity issues with an RSCD Agent in your environment.
This page also includes a link to a list of common error messages, their probable causes, and a checklist for resolving the issue.
Overview of the troubleshooting process
The diagram below is a flowchart of how to troubleshoot an RSCD Agent issue. Issues with an RSCD Agent typically fall into two categories:
- Connectivity — Establishing communication from a server with NSH to a server with an agent
- Access control lists (ACLs) — Ensuring that the exports, users, and users.local files are correct on the target
The agentinfo command is usually a good place to start. Also, ensure that you are familiar with the blid command and that you know how to read the rscd.log file on the target.
Use these steps to troubleshoot any connectivity issues that you might be experiencing with RSCD Agents in your environment.
Description / Details
Can you ping the host from where you are trying to contact it? For example, can you ping the host from the application server? Can you ping it from your workstation?
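When several hosts are involved, the ping check can be scripted. The following is a minimal sketch, assuming a ping implementation that accepts the -c option; the function name ping_hosts and the host names in the usage comment are hypothetical.

```shell
# Report whether each host answers a single ICMP echo request.
# Assumption: ping accepts "-c 1" (Linux, macOS, most BSDs); other
# platforms, such as Solaris or Windows, use different options.
ping_hosts() {
    for host in "$@"; do
        if ping -c 1 "$host" >/dev/null 2>&1; then
            echo "$host reachable"
        else
            echo "$host unreachable"
        fi
    done
}

# Usage (hypothetical host names):
#   ping_hosts appserver01 target01
```

An unreachable result does not prove that the host is down, because ICMP is often filtered. Continue with the port checks in either case.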
Is the agent running?
On UNIX: Log on to the target server, list the running processes with the ps command, and look for the rscd process.
ps -ef | grep 'rsc'
On Windows: Start Windows Task Manager and look for the RSCDsvc.exe process.
NOTE: If the process is not present, check that the agent is installed and then attempt to start it. On UNIX, use the startup scripts (usually found in /etc/init.d). On Microsoft Windows, use the Service Control Manager and look for BladeLogic RSCD Agent.
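On UNIX, a plain grep can match its own command line in the ps output and give a false positive. The following sketch shows one common way to avoid that; the function name find_rscd is hypothetical.

```shell
# Filter a process listing (read from stdin) for the RSCD agent.
# The bracket in '[r]scd' stops grep from matching its own command line,
# because the literal text "[r]scd" does not contain the string "rscd".
find_rscd() {
    grep '[r]scd'
}

# Usage on a UNIX target:
#   ps -ef | find_rscd || echo "rscd is not running"
```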
Is a firewall blocking access?
Ensure that the target machine does not have the RSCD Agent port blocked by a firewall.
Windows 2008 installs a firewall by default. It cannot be uninstalled, but it can be disabled.
To disable the firewall in Windows 2008, enter the following command at a command prompt as an administrator:
netsh advfirewall set allprofiles state off
Can you telnet to the agent port from the application server or your workstation?
telnet hostname 4750
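If telnet is not installed on the machine you are testing from, a TCP connection attempt can be made from the shell instead. This is a sketch only, assuming that bash and the coreutils timeout command are available; the function name port_open and the host name in the usage comment are hypothetical.

```shell
# Try to open a TCP connection to host:port, giving up after 5 seconds.
# Assumptions: bash (for its /dev/tcp redirection) and the coreutils
# "timeout" command are installed on the machine running the check.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (hypothetical host name):
#   port_open target01 4750 && echo "port open" || echo "port closed or filtered"
```

A non-zero exit status means that the connection was refused, timed out, or could not be attempted. It does not say which of those happened.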
Is the agent running on the port that you expect? (typically 4750)
netstat -an | grep 4750
Also, check the secure file on the remote host to see if the agent port is configured to be 4750.
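A bare grep for 4750 also matches other ports that contain those digits (for example 47500) and connections that are not listeners. The following sketch narrows the match; the function name listening_on is hypothetical, and the pattern assumes that netstat prints the local address as address:port or address.port.

```shell
# Succeed if the netstat output on stdin shows a listener on the given port.
# Matches ":4750" or ".4750" followed by whitespace and a LISTEN state,
# so port 47500 and established connections do not match.
listening_on() {
    grep -E "[.:]$1[[:space:]].*LISTEN" >/dev/null
}

# Usage on the target:
#   netstat -an | listening_on 4750 && echo "agent port is listening"
```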
Is the agent running on the right port? What do you get when running agentinfo from the application server or your workstation?
To help determine the cause of discovered issues, see RSCD Agent Error Messages.
What is in the RSCD agent logs on the remote host? This should help you determine what the remote agent is seeing as the incoming connection.
Does the information in the RSCD agent log files on the remote host match what appears in the exports, users, and users.local files on that host?
For example, is the exports file limiting the request to a specific host? Is the user that is making the connection listed in the users or users.local files?
Is the system running Terminal Service or is it a Citrix Server?
If so, you must switch from EXECUTE mode to Application INSTALL mode before you install the agent. Use the following steps:
- From a command prompt:
change user /install
- Check it with the following:
change user /query
It should indicate that the system is now in Application INSTALL mode.
- Complete the normal install process.
- Change back to EXECUTE mode:
change user /execute
Is ePO/HIPS installed? Could it be blocking the agent?
Use the following steps:
- Use NETSTAT to see if the port is listening (see above for instructions).
- Use NMAP to check the port. If the port shows as filtered, it might be blocked.
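The NMAP check can be reduced to a single word by extracting the state column from the port table in the output. This is a sketch that assumes nmap's normal (human-readable) output format; the function name nmap_port_state and the host name in the usage comment are hypothetical.

```shell
# Print the state nmap reports for one TCP port (open, closed, filtered).
# Reads nmap's normal output on stdin; the port table has lines such as
# "4750/tcp filtered unknown".
nmap_port_state() {
    awk -v p="$1/tcp" '$1 == p { print $2 }'
}

# Usage (hypothetical host name):
#   nmap -p 4750 target01 | nmap_port_state 4750
```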
Open the HIPS log and check whether port 4750 is being blocked. If it is, HIPS is the problem. If not, check whether the firewall is blocking the port because of policy.
If the problem is HIPS related, it is probably caused by the server having a static IP and not having a WINS server defined. If the WINS server is not defined, HIPS fails and blocks the ports.
The following sample ipconfig output is returned when the WINS server is defined:
Windows IP Configuration
Host Name . . . . . . . . . . . . : hou-jadair-01
Primary Dns Suffix . . . . . . . : adprod.bmc.com
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : bmc.com
Ethernet adapter Local Area Connection:
Connection-specific DNS Suffix . : bmc.com
Description . . . . . . . . . . . : Broadcom NetXtreme 57xx Gigabit Controller
Physical Address. . . . . . . . . : 00-11-43-9C-CD-8D
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
IPv4 Address. . . . . . . . . . . : 172.18.80.99(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 172.18.80.253
DNS Servers . . . . . . . . . . . : 192.168.248.53
Primary WINS Server . . . . . . . : 172.17.0.251
Secondary WINS Server . . . . . . : 172.19.0.251
NetBIOS over Tcpip. . . . . . . . : Enabled
- Look at a server on the same subnet as the problem system. Get the network information by using ipconfig /all, and study the WINS lines. Then go into the network interface of the problem system and define the WINS server using the information from the other system.
- Restart the agent and try it again.
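If the ipconfig /all output from both systems has been saved to files, the WINS lines can be pulled out for comparison. This is a sketch only; the function name wins_servers and the file name in the usage comment are hypothetical.

```shell
# Print the WINS server lines from "ipconfig /all" output read on stdin.
# Exits non-zero when no WINS server is defined.
# The "WINS Proxy Enabled" line is deliberately not matched.
wins_servers() {
    grep 'WINS Server'
}

# Usage (hypothetical file holding saved ipconfig /all output):
#   wins_servers < ipconfig_problem_host.txt || echo "no WINS server defined"
```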
Did I fix the problem? How can I check?
For one server, the easiest way is:
- From within the console, right-click the server instance and select Properties.
- Use the Lightning Bolt icon in the upper right corner to update the agent status if it is incorrect.
- If connectivity is restored, Agent Status is automatically set to Agent is alive or Agent is not licensed. Either way, there is connectivity.
- If connectivity is restored, change the IS_ONLINE and IS_DEPLOYABLE server properties to TRUE from within the same window.
- Do not worry about licensing if you are not about to run any jobs against the particular server immediately. The machine will become licensed overnight.
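For more than one server, a loop over the agentinfo command can serve as a quick connectivity check. This is a sketch under two assumptions that are not confirmed on this page: that agentinfo is on the PATH of the shell you run it from, and that it exits with a non-zero status when it cannot reach the agent. The function name check_agents and the host names are hypothetical.

```shell
# Run agentinfo against each host and report which ones respond.
# Assumptions: agentinfo is on the PATH and exits non-zero when it
# cannot reach the agent on the given host.
check_agents() {
    for host in "$@"; do
        if agentinfo "$host" >/dev/null 2>&1; then
            echo "$host OK"
        else
            echo "$host FAILED"
        fi
    done
}

# Usage (hypothetical host names):
#   check_agents target01 target02 target03
```

For any host reported as FAILED, run agentinfo against it directly to see the error message.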
Additional RSCD agent troubleshooting topics
For more information about troubleshooting RSCD agents, see the following topics:
The following BladeLogic ZipKit contains an NSH type 2 script that attempts to analyze the root cause for servers that have AGENT_STATE=Agent is not responding:
Blade ZipKit - Agent Health Status - Root Cause Check