Troubleshooting Agent Installer Job issues


An Agent Installer Job (AIJ) is used to perform either a fresh install of an RSCD Agent on a Target Server or upgrade the existing RSCD Agent to a later version. 

This topic helps you to investigate and troubleshoot AIJ failures.

Issue symptoms

An Agent Installer Job fails with an error message in the job run log.

Issue scope

The issue might occur on all targets against which the AIJ is run or might be limited to a subset of the targets.

Diagnosing and reporting an issue

Task

Action

Steps

Reference

1

Understand the problem scope.

  • Verify whether the AIJ is used to install a new Agent or upgrade an existing Agent.
  • Verify whether this issue affects all the AIJ targets or a subset of targets.
  • Verify whether the issue occurs on targets with specific operating systems and versions or on all operating systems.
  • Verify whether the Agent installation was successful in spite of errors.
  • Verify whether the problem is consistent or intermittent. Sometimes, the issue is resolved after rerunning the AIJ.
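The scope questions above can be answered more quickly if the per-target results are tabulated first. The following is a minimal sketch, not part of the product: `TargetResult` and `summarize_scope` are hypothetical names, and the host names, OS labels, and success flags are assumed to be collected by hand from the job run results.

```python
from collections import Counter
from typing import NamedTuple


class TargetResult(NamedTuple):
    host: str        # target server name (entered by hand)
    os_name: str     # free-text operating system label, e.g. "RHEL 8"
    succeeded: bool  # True if the AIJ reported success for this target


def summarize_scope(results):
    """Summarize how widely a failure is spread across targets and OS labels."""
    failed = [r for r in results if not r.succeeded]
    total_by_os = Counter(r.os_name for r in results)
    failed_by_os = Counter(r.os_name for r in failed)
    if not failed:
        scope = "none"
    elif len(failed) == len(results):
        scope = "all"
    else:
        scope = "subset"
    # An OS label is "fully failing" when every target with that label failed.
    fully_failing = sorted(
        os_name for os_name, n in failed_by_os.items() if n == total_by_os[os_name]
    )
    return {
        "scope": scope,
        "failed_hosts": sorted(r.host for r in failed),
        "fully_failing_os": fully_failing,
    }
```

If `fully_failing_os` lists a single operating system while other platforms succeed, the problem is more likely platform-specific than global.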


2

Determine whether the RSCD Agent was successfully installed or upgraded in spite of errors.

Sometimes, the AIJ might return an error or warning but the target agent is successfully installed or upgraded.

If the AIJ is showing as "completed with errors" against a target, check whether the RSCD Agent was successfully installed or upgraded.

Use RDP (Windows) or SSH (Unix/Linux) to verify the version of the installed Agent.

Windows:

Check whether the RSCD Agent is listed under Programs and Features. If listed, note the value of the Version column on the right.

image2020-9-24_10-55-23.png

Note: RSCD versions prior to 8.9.03 will be listed as BladeLogic Server Automation RSCD Agent instead of TrueSight Server Automation RSCD Agent.

Linux RH:

[root@]# rpm -qa | grep BladeLogic

image2020-9-24_10-54-49.png


AIX BFF:

lslpp -L | grep bladelogic

image2020-9-24_10-54-20.png

To confirm the version of Linux/Unix agents, check the following file:

image2020-9-24_10-56-33.png
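When many Linux targets need to be checked, the `rpm -qa | grep BladeLogic` check shown above can be scripted. This is a rough sketch under stated assumptions: `find_agent_packages` and `query_rpm_packages` are hypothetical helper names, the script runs locally on the target, and the exact package name is not assumed, only that it contains "bladelogic" in some letter case.

```python
import subprocess


def find_agent_packages(listing, keyword="bladelogic"):
    """Return the lines of a package listing that mention the keyword (case-insensitive)."""
    return [
        line.strip()
        for line in listing.splitlines()
        if keyword.lower() in line.lower()
    ]


def query_rpm_packages():
    """Run `rpm -qa` on the local host and keep only the matching package lines."""
    output = subprocess.run(
        ["rpm", "-qa"], capture_output=True, text=True, check=True
    ).stdout
    return find_agent_packages(output)
```

An empty list from `query_rpm_packages()` suggests that no matching package is registered with rpm on that host.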

3

Locate the AIJ Job run log.

Do the following:


    1. Log in to the TrueSight Server Automation console.
    2. Right-click the AIJ and select Show Results.
    3. Navigate to the failed job run.
    4. Drill down to the failed node.
      - Agents for which the AIJ attempted to perform an initial install will be listed under the Install node.
      - Agents for which the AIJ attempted to perform an upgrade will be listed under the Upgrade node.
      - Failed targets will be grouped together by distinct error message.
    5. Under the error message, navigate to the specific server you want to troubleshoot.
    6. Review the log messages on the right pane. Double-click on a specific log message to get more details.

Step 3, substep 2:

image2020-8-25_18-13-10.png


Step 3, substep 4:

image2020-9-24_11-15-19.png


Step 3, substep 6:

image2020-8-25_18-19-7.png

4

Export the AIJ log package

From the failed Job Run (step 3, substep 4):


    1. Right-click "Run at mm/dd/yyyy hh:mm:ss" and select "Download Log Package".
    2. Select a location to store the log package.
    3. Name the zip file.
    4. Click Save.
    5. Select the servers you are interested in and move them to the right pane.
    6. Click OK. A confirmation message is displayed.

Note: This zip file contains the following:

  • Export of the Job run log in .csv format
  • Application Server log file entries from the time when the job was running
  • Zip file of rscd.log (including rollover logs) from the target(s)
  • Zip files of Application Server log file(s) from all Application Servers
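Before attaching the log package to a case, it can help to confirm that it contains the items listed above. The sketch below only groups the entries of a zip file by file extension. `group_package_entries` is a hypothetical helper, and no particular file names inside the package are assumed.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_package_entries(package_path):
    """Group the file names inside a log package by file extension."""
    groups = defaultdict(list)
    with zipfile.ZipFile(package_path) as package:
        for name in package.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            suffix = PurePosixPath(name).suffix.lower() or "(no extension)"
            groups[suffix].append(name)
    return {suffix: sorted(names) for suffix, names in groups.items()}
```

A package that matches the description above would be expected to show at least one `.csv` entry (the job run log export) and one or more nested `.zip` entries.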

5

Analyze error(s) from AIJ Job Run Log

Do the following:

  1. Review the error message gathered from the AIJ Job run log (step 3).
  2. See the "Resolutions for common issues" section for common errors that can cause AIJ failures and how they are typically resolved.
  3. If you are unable to identify and resolve the problem, create a BMC Support Case.


6

Creating a BMC Support Case

Provide the following information and log files when creating a case with BMC Customer Support:

  • Scope of the issue as identified in step 1, including the name(s) of the failed targets
  • Status of the Agent installation as determined in step 2
  • Logs captured in step 4
  • Error message identified in step 5


Resolutions for common issues

Symptom

Action

Reference

The following error message is seen in the rscd.log file after running the AIJ to upgrade the RSCD agent.

ERROR MESSAGE:
b6100181f4666ede42ab 0000000020 09/27/23 10:10:35.822 ERROR rscd - phx-hsmops-01 2572 SYSTEM (Not_available): (Not_available): SSL error : ssl\t1_lib.c:3304 error:0A000076:SSL routines::no suitable signature algorithm
8adcff8c7e51948a2e99 0000000021 09/27/23 10:10:35.822 ERROR rscd - phx-bsmops-01 2572 SYSTEM (Not_available): (Not_available): SSL_accept
f13ac3f790da75e5fe30 0000000022 09/27/23 10:10:35.822 ERROR rscd - 172.24.8.76 2572 SYSTEM (Not_available): (Not_available): new_connection_postfork: SSL finish error.

This error occurs when running the AIJ to upgrade the RSCD agent from an older version to a newer version, and there is a change in the key size of the certificate.pem file.

If your certificate.pem file was created with a key size of 1024 bits or less, you must regenerate it with a key size of 2048 bits or higher, because the FIPS minimum key length is now 2048 bits.

This mismatch in key sizes can disrupt the upgrade process. Take the following corrective actions to ensure a successful upgrade:

  1. Delete the /etc/rsc/certificate.pem file.
  2. Restart the RSCD agent service for the change to take effect.
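Before deleting the file, you may want to confirm that the key size is actually below 2048 bits. The following sketch assumes that the `openssl` command-line tool is installed on the target and can read an X.509 certificate block from certificate.pem. The helper names are hypothetical. The regular expression is written to tolerate the slightly different "Public-Key" labels printed by different OpenSSL releases.

```python
import re
import subprocess

MIN_KEY_BITS = 2048  # minimum key length stated in the text above


def parse_key_bits(openssl_text):
    """Extract the public key size from `openssl x509 -noout -text` output."""
    match = re.search(r"Public[- ]Key: \((\d+) bit\)", openssl_text)
    return int(match.group(1)) if match else None


def certificate_key_bits(pem_path="/etc/rsc/certificate.pem"):
    """Ask the local openssl CLI to describe the certificate, then parse the key size."""
    result = subprocess.run(
        ["openssl", "x509", "-in", pem_path, "-noout", "-text"],
        capture_output=True, text=True, check=True,
    )
    return parse_key_bits(result.stdout)


def needs_regeneration(bits):
    """True when the key size is known and below the minimum."""
    return bits is not None and bits < MIN_KEY_BITS
```

If `certificate_key_bits()` returns `None`, the output could not be parsed and the file should be inspected manually rather than assumed to be compliant.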


ERROR MESSAGE:
All Remote Host Authentications Failed Validation
JOB RUN LOG:
Remote host authentication 1 of 1 'UNIFIED_AGENT_INSTALLER_20.02_3' failed to validate against server '<hostname>' due to: Failed to connect: <hostname>: java.net.SocketException: Connection reset

Verify whether SMB is enabled on the target server and the PXE server is installed (Only applies to Windows Targets).


    1. RDP to the target server.
    2. Open PowerShell.
    3. Run the following command to check whether SMB is enabled:
      Get-SmbServerConfiguration | Select EnableSMB2Protocol
    4. If SMB is not enabled, run the following command to enable it:
      Set-SmbServerConfiguration -EnableSMB2Protocol $true

ERROR MESSAGE:
No Rule Defined For Server

JOB RUN LOG:
All remote host authentication routing rules evaluated to false for server '<hostname>'.

Do the following:


    1. Navigate to Configuration > Infrastructure Management > Remote Host Authentication Routing Rules.

      image2020-9-27_18-56-55.png

      1. If no rules are defined, create one using instructions in the referenced documentation.
      2. If a rule is defined, right-click the rule and select Properties.
    2. On the Rule Definition tab, confirm that the condition is valid for the target.

      image2020-9-27_18-58-28.png

    3. If it is valid, confirm that a Remote host authentication is selected on the Remote Host Authentications tab.

      image2020-9-27_18-58-57.png

    4. If a valid remote host authentication is selected, close this dialog box, right-click the specified remote host authentication name, and select Properties.

      image2020-9-27_19-10-48.png

    5. Confirm that the remote host authentication is valid for the operating system of the target.

      image2020-9-27_19-12-32.png

    6. If it is valid for this target, confirm that the Automation Principal selected in the previous step has the correct credentials under RBAC Manager > Automation Principals.

      image2020-9-27_19-14-7.png

ERROR MESSAGE:
Agent Configuration Error

JOB RUN LOG:
<Role>:<User> has no authorization to access the host <hostname>.  It is possible that authorization to connect may be missing from either the exports file, users file, or users.local file on the agent.  It is also possible that the secure file on the agent is configured for additional levels of authentication than what the appserver is configured for.

The Agent ACL files (exports, users, or users.local) are incorrect on the target.


    1. RDP or SSH to the server.
    2. Go to the rsc directory:
      (Windows) C:\Windows\rsc
      (Linux) /etc/rsc
    3. Open the exports file.
      It should have an entry that allows access from the Application Server. For example:
      * rw
      Here, * allows connections from any server. Alternatively, the file can list only the host names of the Application Servers.
    4. Open the users and users.local files.
      They should have an entry for your user and role.
      For example:
      BLAdmins:Bladmin rw, map=Administrator
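The manual checks above can be approximated in a script when several targets need to be reviewed. This sketch is deliberately simplified and makes assumptions that are not confirmed by the text above: each entry is assumed to start with a host name (or `*`) or a `Role:User` pair as the first whitespace-separated field, and `#` is assumed to start a comment. Any other entry syntax the agent supports is ignored. Both function names are hypothetical.

```python
def _entries(text):
    """Yield the first field of each non-empty, non-comment line."""
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if line:
            yield line.split()[0]


def exports_allows(exports_text, host):
    """Return True if an entry is '*' or names the given host (case-insensitive)."""
    return any(
        entry == "*" or entry.lower() == host.lower()
        for entry in _entries(exports_text)
    )


def users_has_entry(users_text, role, user):
    """Return True if a line starts with the exact 'Role:User' pair."""
    wanted = f"{role}:{user}"
    return any(entry == wanted for entry in _entries(users_text))
```

For example, `users_has_entry(text, "BLAdmins", "Bladmin")` returns `True` for the sample entry shown above.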

ERROR MESSAGE:
SSH Connection Failed

JOB RUN LOG:
Remote host authentication 1 of 1 '<Remote Host Authentication Name>' will be skipped because the execution protocol 'SSH' is not valid for the agent platform 'Windows 64-bit'.

ERROR MESSAGE:
SSH Connection Failed

JOB RUN LOG:
Remote host authentication 1 of 1 '<Remote Host Authentication Name>' failed to validate against server '<ServerName>' due to: Failed to connect to SSH port 22:  Connection refused: connect

Do the following:


    1. Go to Configuration > Infrastructure Management > Remote Host Authentication.
    2. Select the <Remote Host Authentication Name> that appears in the log message, right-click it, and select Properties.
    3. On the Rule Definition tab, confirm that the condition is valid for the target.
    4. Review and modify the Command Execution Protocol if it is incorrect.
    5. If the Command Execution Protocol is correct for this Remote Host Authentication, review the Remote Host Authentication Rule to ensure that it is evaluating to the expected Remote Host Authentication.
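If the protocol setting is correct and the connection is still refused, a basic reachability test can show whether anything is listening on the SSH port of the target. The following is a generic TCP check and not a product feature. `port_is_open` is a hypothetical helper, and it should be run from a host with the same network path to the target as the Application Server.

```python
import socket


def port_is_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A `False` result only shows that the TCP connection could not be established. It does not distinguish between a stopped SSH service, a firewall, and an unresolvable host name.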

ERROR MESSAGE: Invalid Username/Password or Key

JOB RUN LOG: Remote host authentication 1 of 1 '<Remote Host Authentication Name>' failed to validate against server '<ServerName>' due to: Invalid username or password when connecting with user '<UserName>'.

Do the following:


    1. Navigate to Configuration > Infrastructure Management > Remote Host Authentication Routing.
    2. Select the <Remote Host Authentication Name> that appears in the log message, right-click it, and select Properties.
    3. Confirm the name of the selected Automation Principal.
    4. Navigate to RBAC Manager, select the Automation Principal identified in step 3, and open it.
    5. Confirm that the Principal ID is correct and retype the passphrase.


ERROR MESSAGE:
PsExec Not Found

JOB RUN LOG:
Remote host authentication 1 of 1 '<Remote Host Authentication Name>' failed to validate against server '<ServerName>' due to: PsExec is either not installed or could not be found in the path on PsExec Server '<PsexecServerName>'.

Do the following:


    1. Navigate to Configuration > Infrastructure Management > Remote Host Authentication Routing.
    2. Select the <Remote Host Authentication Name> that appears in the log message, right-click it, and select Properties.
    3. Verify the name of the selected PsExec server.
    4. RDP to the PsExec server.
    5. Open a Command Prompt on the server as Administrator.
    6. Run the following command: psexec

      Output

      PsExec v2.2 - Execute processes remotely
      Copyright (C) 2001-2016 Mark Russinovich
      Sysinternals - www.sysinternals.com
      PsExec executes a program on a remote system, where remotely executed console
      applications execute interactively.
      Usage: psexec [\\computer[,computer2[,...] | @file]][-u user [-p psswd][-n s][-r
       servicename][-h][-l][-s|-e][-x][-i [session]][-c [-f|-v]][-w directory][-d][-<p
      riority>][-a n,n,...] cmd [arguments]
           -a         Separate processors on which the application can run with
                      commas where 1 is the lowest numbered CPU. For example,
                      to run the application on CPU 2 and CPU 4, enter:
                      "-a 2,4"
           -c         Copy the specified program to the remote system for
                      execution. If you omit this option the application
                      must be in the system path on the remote system.
           -d         Don't wait for process to terminate (non-interactive).
           -e         Does not load the specified account's profile.
           -f         Copy the specified program even if the file already
                      exists on the remote system.
           -i         Run the program so that it interacts with the desktop of the
                      specified session on the remote system. If no session is
                      specified the process runs in the console session.
           -h         If the target system is Vista or higher, has the process
                      run with the account's elevated token, if available.
           -l         Run process as limited user (strips the Administrators group
                      and allows only privileges assigned to the Users group).
                      On Windows Vista the process runs with Low Integrity.
           -n         Specifies timeout in seconds connecting to remote computers.
           -p         Specifies optional password for user name. If you omit this
                      you will be prompted to enter a hidden password.
           -r         Specifies the name of the remote service to create or interact.
                      with.
           -s         Run the remote process in the System account.
           -u         Specifies optional user name for login to remote
                      computer.
           -v         Copy the specified file only if it has a higher version number
                      or is newer on than the one on the remote system.
           -w         Set the working directory of the process (relative to
                      remote computer).
           -x         Display the UI on the Winlogon secure desktop (local system
                      only).
           -arm       Specifies the remote computer is of ARM architecture.
           -priority  Specifies -low, -belownormal, -abovenormal, -high or
                      -realtime to run the process at a different priority. Use
                      -background to run at low memory and I/O priority on Vista.
           computer   Direct PsExec to run the application on the remote
                      computer or computers specified. If you omit the computer
                      name PsExec runs the application on the local system,
                      and if you specify a wildcard (\\*), PsExec runs the
                      command on all computers in the current domain.
           @file      PsExec will execute the command on each of the computers listed
                      in the file.
           cmd            Name of application to execute.
           arguments  Arguments to pass (note that file paths must be
                      absolute paths on the target system).
           -accepteula This flag suppresses the display of the license dialog.
           -nobanner   Do not display the startup banner and copyright message.
      You can enclose applications that have spaces in their name with
      quotation marks e.g. psexec \\marklap "c:\long name app.exe".
      Input is only passed to the remote system when you press the enter
      key, and typing Ctrl-C terminates the remote process.
      If you omit a user name the process will run in the context of your
      account on the remote system, but will not have access to network
      resources (because it is impersonating). Specify a valid user name
      in the Domain\User syntax if the remote process requires access
      to network resources or to run in a different account. Note that
      the password and command is encrypted in transit to the remote system.
      Error codes returned by PsExec are specific to the applications you
      execute, not PsExec.

    7. If the following output is received, psexec is not installed correctly.
      'psexec' is not recognized as an internal or external command, operable program or batch file.
      Do the following to install psexec:
      1. Download PsTools from the following website:

        https://docs.microsoft.com/en-us/sysinternals/downloads/psexec

      2. Extract PsExec.exe from the downloaded archive to C:\Windows, and then run the following command:
        C:\Windows\PsExec.exe
      3. Read the agreement and agree to the terms.
      4. Run the psexec command again to confirm that it is working.
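The "not recognized" message above means that the command interpreter could not find the executable on the search path. The same lookup can be done programmatically, as in the sketch below. `locate_tool` is a hypothetical helper built on the standard `shutil.which` function. The default names assume the executables shipped in PsTools (`PsExec.exe` and `PsExec64.exe`).

```python
import shutil


def locate_tool(names=("psexec", "psexec64")):
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None
```

When run on the PsExec server, a return value of `None` indicates that neither executable is on the PATH of the account running the script, which may differ from the PATH of the account used by the Automation Principal.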

ERROR MESSAGE:
SMB Access Denied
JOB RUN LOG:
Remote host authentication 1 of 1 '<Remote Host Authentication Name>' failed to validate against server '<ServerName>' due to: Received an 'Access Denied' error trying to access 'smb://<UserName>@<ServerName>:/C$/'.  Please make sure '<UserName>' is an administrative user and has access to the requested location.

The user that the specified Automation Principal maps to must be a member of the Administrators group on the target server.

Add the user to the Administrators group on the target server, or change the user used by the Automation Principal.
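To confirm group membership on the target, one option is to capture the output of `net localgroup Administrators` and check it for the account. The sketch below parses that output. It assumes the English-locale layout, in which a line of dashes precedes the member list and a "The command completed successfully." line follows it. The function names and the sample account are hypothetical.

```python
def parse_localgroup_members(output):
    """Extract member names from English-locale `net localgroup <group>` output."""
    members = []
    in_members = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("---"):  # separator line precedes the member list
            in_members = True
            continue
        if in_members:
            if not stripped or stripped.lower().startswith("the command completed"):
                break
            members.append(stripped)
    return members


def is_member(output, account):
    """Case-insensitive match on the full member name as printed."""
    wanted = account.lower()
    return any(m.lower() == wanted for m in parse_localgroup_members(output))
```

The match is on the full name as printed, so a domain account must be given in the same DOMAIN\user form that the command displays. Membership granted indirectly through a nested domain group is not detected by this check.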

 
