The following topics apply to troubleshooting issues with the agent:
Two utilities are available for administering the RSCD agent in the <BMCServerAutomationInstallation>/RSC directory: agentctl and secadmin.
With agentctl, you can start, stop, restart, kill, or pause the agent. The restart option is used during the upgrade process and lets you bring down the agent, perform a command, and start the agent again. For an agent on a Windows domain controller, you can also use agentctl to change the password of the RSCD agent user account. For more detailed information about the agentctl command, see the related man page.
With secadmin, you can create the secure file from the command line. For more detailed information about the secadmin utility, see Configuring the secure file.
For the agent to work properly, all hostnames specified in the configuration files under the rsc directory must be resolvable. If DNS is not configured, modify the hosts file or use the agent's IP address.
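The resolution requirement above can be checked with a short script. The following is a minimal sketch, not a BMC utility: the function name unresolved_hosts and the example host names are assumptions for illustration, and the list of names must be collected by hand from your own configuration files.

```python
import socket


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the hostnames that the resolver cannot translate to an address."""
    failed = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed


# Example call (hypothetical host names, replace with your own):
#   unresolved_hosts(["localhost", "myserver"])
```

Any name returned by the function needs either a DNS record or a hosts file entry before the agent can use it.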
The patch analysis function of BMC Server Automation requires the Microsoft XML (MSXML) parser version 6.0 SP2 or later to be installed on the server on which the RSCD agent is installed. You can install the RSCD agent on a computer on which MSXML is not installed, but patch analysis does not function correctly until MSXML 6.0 SP2 or later is installed. Run a live audit on target agents to determine whether MSXML is present, then download the appropriate service pack from the Microsoft site and deploy it using BMC Server Automation.
The following processes run on each agent, depending on the operating system. Additional processes might be running if jobs are running on the agent or if jobs have not exited properly. You can kill processes that did not exit properly so that the agent can be restarted.
Operating system | Processes | Additional details |
---|---|---|
Windows | RSCDSvc and the Agent process | RSCDSvc is in charge of restarting the Agent process whenever it shuts down. RSCDSvc attempts to restart the Agent process up to 50 times. After that, if the service does not succeed in restarting the Agent, it stops attempting to restart the Agent and shuts itself down. |
Linux or AIX | rscw (watcher), rscd (listener), rscd (logger) | The first process to start is the Agent watcher process, rscw (on Linux and AIX) or rscd (on HP-UX and Solaris). The watcher process spawns and monitors the other two processes, the Agent listener and the Agent logger, both named rscd. Whenever a listener or logger process shuts down, the watcher process tries to restart it. Unlike on Windows, there is no fixed count after which the Agent watcher process shuts itself down. It remains functional until it is killed by the superuser or by the operating system as part of a system shutdown or restart. |
HP-UX or Solaris | rscd (watcher), rscd (listener), rscd (logger) | Same behavior as on Linux and AIX, except that the watcher process is named rscd. |
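To find leftover agent processes on UNIX before a restart, you can filter a process listing for the names in the table. The sketch below assumes two-column output (PID, then command name) such as that produced by a command like ps -e -o pid,comm; the exact ps options vary by platform, and the function name agent_pids is a hypothetical helper, not part of the product.

```python
AGENT_PROCESS_NAMES = {"rscd", "rscw"}  # names taken from the table above


def agent_pids(ps_output):
    """Return [(pid, name)] for agent processes found in two-column ps output."""
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        pid = int(parts[0])
        # Keep only the executable name if ps printed a full path.
        name = parts[1].strip().rsplit("/", 1)[-1]
        if name in AGENT_PROCESS_NAMES:
            found.append((pid, name))
    return found
```

If the agent has been stopped and the function still returns entries, those processes are candidates for a manual kill.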
On all OS platforms, an agent log records all transactions between the application server and the agent. This does not mean that everything that appears on the agent (such as script output that is written to stdout) appears in the logs; however, every command issued from the application server is logged with the role:user who executed it, a date and timestamp, and the actual command.
The log file can be viewed using the logman command from the NSH command line.
On Windows, the rscd.log file can be found in <BMCServerAutomationInstallation>/RSC. On UNIX, the file is located in the .../RSC/log directory. In these same directories, you might see files named rscd.log1, rscd.log2, ..., rscd.logn. Each time an agent goes down and restarts, the current log file is saved under a numbered name and a new rscd.log takes its place.
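When a log directory contains many rotated files, a plain alphabetical listing places rscd.log10 before rscd.log2. The following sketch orders the names numerically. The function name rotated_logs is an assumption for illustration, and the sketch makes no claim about which numbered file is the most recent.

```python
import re

_LOG_NAME = re.compile(r"^rscd\.log(\d*)$")


def rotated_logs(filenames):
    """Return agent log names: rscd.log first, then numbered files in numeric order."""
    matches = []
    for name in filenames:
        m = _LOG_NAME.match(name)
        if m:
            index = int(m.group(1)) if m.group(1) else 0
            matches.append((index, name))
    return [name for _, name in sorted(matches)]
```

You can pass the result of os.listdir on the log directory to the function and read the files in the returned order.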
For rollback information and job logs, view the logs in the .../RSC/transactions directory.
For more information see Controlling agent logging with the log4crc.txt file.
BMC recommends that you start and stop the RSCD Agent using the procedures in the table below. The UNIX rscd script executes the ps command on the agent and greps for the rscd process. If found, it executes the kill command on the process. The UNIX script does not use the agentctl command; however, the user can alter the script to do so.
In addition to the procedures below, users can also use the agentctl command on both platforms to manage the RSCD Agent.
Platform | Stop Server | Start Server |
---|---|---|
UNIX | /etc/init.d/rscd stop | /etc/init.d/rscd start |
Windows | Stop BladeLogic RSCD Agent Service | Start BladeLogic RSCD Agent Service |
Due to a bug in the Windows agent, it will not always stop cleanly when shutting down.
Various executables (in particular rscdsvc.exe) can be locked by system monitoring tools.
You have just installed an RSCD Agent and it does not start. Also, no logs are showing up in the rscd.log file, or perhaps it is not being created.
To troubleshoot this problem, validate your hosts file. Assuming that your server is called myserver and it has an IP address of 192.168.0.9, you will want to see something like this:
127.0.0.1 localhost
192.168.0.9 myserver
If you don't have an entry for myserver (or if you have a typo), your RSCD agent might not start.
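The hosts file check can be scripted. The sketch below parses hosts file text and reports whether a given name is present, optionally with the expected address. The function names hosts_entries and has_entry are hypothetical helpers; myserver and 192.168.0.9 are the example values used above.

```python
def hosts_entries(hosts_text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address listed for a name.
            mapping.setdefault(name.lower(), ip)
    return mapping


def has_entry(hosts_text, hostname, expected_ip=None):
    """Return True if hostname is listed (and matches expected_ip, if given)."""
    ip = hosts_entries(hosts_text).get(hostname.lower())
    if ip is None:
        return False
    return expected_ip is None or ip == expected_ip
```

A False result for the server's own name points to the missing or mistyped entry described above.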
If the Windows agent is in the process of handling any type of command when you stop it, the agent might not stop cleanly. If the agent does not stop cleanly, when the upgrade finishes and tries to restart the agent, the restart will fail.
After stopping the agent, execute netstat -an. If you see a line similar to the one below, then the agent did not stop cleanly.
TCP 0.0.0.0:4750 0.0.0.0:0 LISTENING
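The same check can be automated by scanning captured netstat -an output. This is a minimal sketch that assumes the four-column Windows layout shown in the sample line (protocol, local address, foreign address, state) and uses port 4750 from that sample as the default; the function name is_port_listening is a hypothetical helper.

```python
def is_port_listening(netstat_output, port=4750):
    """Return True if a TCP socket is in the LISTENING state on the given port."""
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0].upper() != "TCP":
            continue  # skip headers, UDP rows, and other layouts
        local_address, state = fields[1], fields[3]
        if local_address.endswith(suffix) and state == "LISTENING":
            return True
    return False
```

A True result after the agent has been stopped indicates that the agent did not stop cleanly.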
Try performing the following actions:
After running the installer, you can check for the log file in C:\<WINDIR>\Temp\RSCD-Install.log.
In this problem scenario, the agent's executable files are locked, and the agent cannot start.
A good tool to use is Handle, which can be found at http://www.sysinternals.com. You can run Handle against the locked executables, and it should report the process that has them locked.
If Handle does not tell you anything, check the following to see if there is a lock on it:
Prior to 8.1, the BladeLogicRSCD user had a hard-coded default password. Since 8.1, this password is randomly generated and stored in the registry key HKEY_LOCAL_MACHINE\SECURITY\SAM\BladeLogic\Operations Manager\RSCD as the 'E' and 'S' values. Also, in prior versions, if the password for this user was changed using chapw, the same location has a value 'p' with the changed password.
If the user exists (with either the random password or a changed password) but the registry values that store the password have been deleted or were never created, the agent does not start and reports an error. If the user does not exist, the user and the registry entries are re-created, which explains why the agent starts successfully once the user is deleted.
After the agent starts, check the registry key and its values to confirm that they were created. Also, if you see BladeLogicRSCD@BL-WINWWW in the logs, the machine BL-WINWWW might be on a domain. If it is a domain controller in a multi-master environment, that may be related to the error.
Upgrade of an agent is not an issue with these changes, though, as the newer versions of the agent are backward-compatible.
When a deploy job runs on an agent, the job is executed, and content and XML instruction files for rollback are pushed to the agent and saved. The content and instruction files are saved in the transactions directory along with the job log files (see Logging for more information).
Because the content and instruction files must reside on the agent in the transactions directory in order for the user to roll back the deploy job, files that are associated with an installation that may need to be rolled back should NOT be deleted. For those jobs that do not require rollback, either configure the job not to allow rollback, or delete the appropriate content and instruction files from the transactions directory.
Files left behind in the temporary or staging directory can be deleted at any time. The staging directory for each server is set in a server's property list. You can view the property via the Configuration Manager console.
Before installing the agent on a Red Hat Enterprise Linux 5.0 platform, make sure that the SELINUX setting in /etc/selinux/config is set to disabled. If it is not, change the value to disabled and reboot the server prior to the installation.
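The SELINUX setting can be read with a few lines of code before you start the installer. The following is a sketch under the assumption that the file uses simple KEY=value lines; the function name selinux_mode is a hypothetical helper.

```python
def selinux_mode(config_text):
    """Return the value of the SELINUX= setting, or None if it is not set."""
    mode = None
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines without an assignment
        key, value = line.split("=", 1)
        if key.strip() == "SELINUX":
            mode = value.strip().lower()
    return mode


# Example call:
#   with open("/etc/selinux/config") as f:
#       ready = selinux_mode(f.read()) == "disabled"
```

Any result other than disabled means that the value must be changed and the server rebooted before the installation.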
If a VMFS file system has a space in its mount point name, then that mount point may not appear properly under the File System node in the BMC Server Automation Console.
Setting the agent log rolling size limit to a low value (such as 100KB) can cause problems with the secure logs, including some not getting signed.
Hardware Information Snapshot and Audit Jobs run for a longer time when multiple objects in the hierarchy are selected with the recursive option.
Workaround: For better performance, do one of the following:
Select one or more of the leaf nodes needed for the snapshot and clear the Recurse subfolders option if it was selected. (This option is disabled by default for leaf nodes.)
Hardware Information Object system commands invoked by UNIX objects (for example, UnixUsers and UnixGroups) might produce messages that seem like errors but do not prevent the UNIX object from doing its job. Umbrella facilities such as SELinux can occasionally inject into system commands errors or warnings that are associated with their own corrupted configurations rather than with the system command. To determine whether a message represents a real problem (assuming that the UNIX object's action was successful), use the following methods:
Run the same command from the command line. If the same error or warning appears but the command succeeds, the issue is not likely to be related to the UNIX object.
The console can freeze if you run a Network Shell script that uses an nexec command to run a PowerShell command on an agent, or if you browse an extended object that is defined to run a PowerShell command on an agent using remote execution. The freeze always occurs when using PowerShell version 1 and sometimes with PowerShell version 2.
Workaround: Run the PowerShell command directly on the agent using a command line interface, or upgrade to PowerShell version 2 and invoke the PowerShell command using the "-InputFormat none" flag.