RSCD Agent Error Messages
The following table lists common errors you may see in NSH, job run logs, or in the GUI when trying to connect to a remote host running an RSCD Agent, along with the possible cause and solution to the problem.
Issue | Description |
---|---|
No authorization to access host | Probably the most common error, this is caused by a mismatch between the ACLs on the remote host and the credentials you are using to connect to it. The files to verify are the users or users.local files on the target, although the exports file may also cause the error. Generally, commenting the … Refer to the rscd.log in the agent install directory of the remote host to compare the user trying to establish the connection against the entries listed in the ACLs on that server. |
Login not allowed for user | There are several reasons why this error may occur.
|
Permission Denied | There are two possible causes, and either or both may apply. First, check the permissions your role has on that server. In the Server view, right-click the server you are trying to deploy to and select Properties. Click the Permissions tab and ensure you have write access on that server (for example, "BLAdmins Server.*"). Second, verify that the ACLs on the remote host grant you write access. View the agent ACLs (the exports, users, or users.local files) and make sure your role and login have read/write (rw) access. |
No Route to Host | For SOCKS Proxy: This happens when SOCKS proxies have been configured and routing rules have been enabled, but the SOCKS proxy is misconfigured or down, or the Application Server cannot reach the SOCKS proxy server. Check the routing rules, verify in the SERVER properties whether the particular server has been configured to go through a SOCKS proxy based on the rules, and verify that the SOCKS proxy server is up and running. Check the SOCKS proxy log files. For Non-SOCKS Proxy: The Application Server maintains a DNS cache. Try this:
Since we cache indefinitely because of the underlying Java settings, the only way to pick up the new IP would be to restart the Application Server. The cache still honors the TTL on the DNS record, so if it expires we should request it again and pick up the new value. Also, see if you have a firewall blocking the port, or if your Security Level is enabled or disabled:
|
ERROR IN TLS PROTOCOL / Encryption configuration error | Generally this is caused when the secure file differs between the two hosts making contact. Ensure that both hosts are communicating using the same protocol and encryption levels. Always use the … This could also be a problem with the agent itself or an interaction between the OS and the agent. Try different commands like … This error can also occur when Shavlik (which is known to use port 4750) or any other program on the remote host is already using port 4750, the port the RSCD agent uses. Try the following: Restart the RSCD agent, stop the program listening on port 4750, use netstat to make sure the port is not in LISTEN mode, then restart the agent. This error has also been observed during a mass agent rollout using a silent install on UNIX systems. The silent install chose the option of using a random number generating device on the system when one did not exist. Reinstalling the agents with the PRNG option resolved the issue. |
I/O Error | This is sometimes shown in place of … You may also want to try restarting the agent. |
Remote host is unknown | This error occurs when either the Application Server or your client cannot resolve the host. Sometimes this is the case when you run a custom command and receive blank or no output. Ensure you can ping the remote host and that the server is correctly configured in DNS. |
Connection timed out | You might see this error in the following situations:
|
Connection refused | Generally this error shows up when the remote host is down or the agent is not running. It can also happen when the port the agent communicates over (configured in the secure file) does not match the port configured on the originating side of the connection. |
"SSL error" with "SSL Protocol Mismatch Error" in agent logs | If telnet to the target on the agent port 4750 works, the target server could be part of a cluster whose local loopback NICs talk to each other on a full 10 network. The Application Server sends a request to the target server, and the acknowledgement is redirected out to those servers' 10.x.x.x network. Because the Application Server IP is 10.52.x.x and the route table says anything bound for the 10 network goes that way, the acknowledgement goes to the internal loopback NICs and never gets back to the Application Server. You need to reconfigure the local loopback network to something besides the 10 network. |
Error code (2): Invalid XML source | When attempting to license RSCD agent via … This error message is an indicator that there is a problem in the agent ACLs. The message in the rscd.log file was that the user account being mapped to did not have sufficient permissions to access the necessary directories and was unable to generate a HostID. Once the ACLs are corrected, … If you see this error message, do the following:
|
App Server and client on the same server with App Server certification plus targets on a separate machine | When a customer had an Application Server and the client he was using on the same machine and also did a push ( … ), the error was: Can't access host "SERVER_XY": Login not allowed for user. In the RSCD.log file there will be one line with the following error: Certificate check failed on agents. Solution A (Application Server certificate incompletely or incorrectly installed): Note: This section describes steps that you can also find in this procedure: TLS with client-side certs - Securing a Network Shell client. After implementing App Server to Agent security, the agents will be configured to accept only certified connections, meaning that at that point only the Application Server will be able to communicate with the agents through the console. Stand-alone NSH will no longer work, as it does not use the Application Server's certificate. That is why you also need to implement NSH to agent security if you want to access agents using NSH. Follow these steps: 1. Delete id.pem from $HOMEDIR/ Application Data /Bladelogic on the machine running NSH. After restarting NSH, it will prompt you for the passphrase entered in step 4. This will certify the NSH session and thus enable it to communicate with the secured agents. Solution B (Application Server certificate installed already): Follow the steps described in TLS with client-side certs - Securing a Network Shell client. The created client-side certificate then has to be pushed to the Application Server / client. Solution C (Nothing installed yet): Follow these procedures step by step:
After that, you should be able to run agentinfo <agent> without the error: Can't access host "SERVER_XY": Login not allowed for user |
Windows group policy for the agent | The RSCD agent runs under the "Local System" account. For the impersonation to occur, the RSCD Agent will "logon" as the BladeLogicRSCD user. Windows API calls are then made that apply the appropriate permissions associated with the user you are going to map to. This allows commands to be executed in the context of the 'mapped to' user. However, the underlying running user is still the "Local System" account, which does not have access to network resources. That "Local System" user cannot connect to remote Windows shares. |
Licensing Issues | Note: As of version 8.2, RSCD Agents no longer need to be licensed. The following issues should only arise with 8.1 agents and older. Software Not Licensed: This error shows up if the license on the host is not valid. It could be because the host was never licensed, or because it was licensed for a limited number of CPUs and the number of CPUs has increased (this mostly happens on virtual machines). The agent takes the following steps to verify the validity of a license:
Agent is intermittently licensed/Not Licensed: This usually happens when an agent is running on a VM or any system with dynamic CPU allocation. We generate 1 license for a certain block of 4 CPUs. If you licensed the server agent when it was allocated 1 or 2 CPUs, it will generate that license for 3 CPUs. The next time you try to access that server, if it is under load, it might have been allocated 5 CPUs, in which case it becomes unlicensed. A second later, it could be back to 3 CPUs and become licensed again. The actual validity of a license is based on these bands: 1-3 CPUs, 4-7 CPUs, 8-11 CPUs. So a license valid for a machine with 1 CPU will still be valid regardless of whether that machine has 1, 2, or 3 CPUs. Once it is upgraded to 4, though, the license will no longer work. A 4 CPU license will be good for 4, 5, 6, or 7 CPUs but not 8. It is different in the reverse direction: if you are licensed for 8 CPUs and you take out 7 of them, that license will still work on a 1 CPU machine. To determine the number of CPUs, we use a rather complicated algorithm that goes deep inside the hardware, so even if the OS does not provide this information, our agent is able to get it (that is, do not rely on what the OS in the VM shows you). If you have dynamic CPU allocation, you should generate a license for the maximum number of CPUs the system can be allocated. Use getlic to generate a license.raw: run # getlic localhost, then # cat license.raw, which shows a line such as: localhost 1 48897F18. The number in the middle is the number of CPUs. Change that number to Max_nb_of_cpus (for example, 32), license this host on the Bladelogic licensing portal, and use putlic to put the license.dat on the server. Alternatively, if you are licensing with autolic, newer versions (7.4.3+) have a -c flag to force a different CPU count than the one the agent retrieves.
Agent shows up as licensed with agentinfo but not from the application server point of view: The agent is licensed according to agentinfo, but not according to the application server (job logs). Most likely, the SERVER property AGENT_STATUS has not been updated. This can be done either through the update icon on the SERVER Properties or through an update server properties job. You can also set AGENT_STATUS for many servers with the update-server-agent-status.nsh NSH script in the .../samples/blcli directory. |
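
The CPU band rule in the Licensing Issues row above can be sketched as a few shell helpers. This is only an illustration of the arithmetic described there, not how the agent itself checks a license: the function names (band_upper, license_covers, set_cpu_count) are hypothetical and are not BladeLogic commands, and the assumption that bands continue in steps of 4 beyond 8-11 is ours. The last helper rewrites the middle field of a license.raw line, assuming the three-field format shown in the row above.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the CPU band rule; not BladeLogic tools.

# Print the highest CPU count covered by a license generated for $1 CPUs.
# Bands from the text: 1-3, 4-7, 8-11 (assumed to continue in steps of 4).
band_upper() {
    echo $(( ($1 / 4) * 4 + 3 ))
}

# Succeed (exit status 0) if a license generated for $1 CPUs still covers
# a machine that currently has $2 CPUs. Fewer CPUs than licensed is fine;
# only crossing the top of the band invalidates the license.
license_covers() {
    [ "$2" -le "$(band_upper "$1")" ]
}

# Read a license.raw line ("<host> <cpus> <id>") on stdin and print it with
# the CPU count (second field) replaced by $1.
set_cpu_count() {
    awk -v n="$1" '{ $2 = n; print }'
}
```

For example, `license_covers 2 5` fails because a license generated at 2 CPUs only covers up to 3, while `license_covers 8 1` succeeds. Piping `localhost 1 48897F18` through `set_cpu_count 32` prints `localhost 32 48897F18`, which is the edit to make before licensing a host with dynamic CPU allocation for its maximum CPU count.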