Troubleshooting upgrade issues
The preferred way to apply a patch or Service Pack, or to upgrade to a new version, is from the appliance's UI. For more information, see Performing the upgrade.
Resolving upgrade issues
You must be logged in to the appliance using a screen session. Enter:
See Recovering from a lost connection using screen.
You can perform the following upgrades on standalone machines and clusters either from the BMC Discovery UI or by running the tw_run_upgrade command line utility:
- Upgrade to a BMC Discovery Service Pack or a version later than 10.0
- Upgrade the Operating System
tw_run_upgrade capabilities
The tw_run_upgrade utility is an interactive command line tool. Depending on the upgrade issue you have, it tells you which option to run the utility with and any additional action you must take to resolve the issue. To learn about some of the typical scenarios where you will use the tw_run_upgrade utility, see Resolving an incomplete upgrade process.
Running tw_run_upgrade
Before you start an upgrade, make sure that you have downloaded the compressed upgrade archive from the BMC Electronic Product Distribution (EPD) site and copied it to the /usr/tideway/var/upgrade directory of the machine from which you will run the upgrade.
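The check described above can be scripted. The following is a minimal sketch, not part of the product: the function name list_upgrade_archives is hypothetical, and because the archive's file name is not given here, the sketch matches any regular file unless you pass a name pattern as the second argument.

```shell
# Sketch only: list files in the upgrade directory before starting an upgrade.
# list_upgrade_archives is a hypothetical helper, not a BMC Discovery utility.
# $1: directory to check (defaults to the upgrade directory named in the text)
# $2: file name pattern (assumption: defaults to any regular file)
list_upgrade_archives() {
    dir="${1:-/usr/tideway/var/upgrade}"
    pattern="${2:-*}"

    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir" >&2
        return 1
    fi

    # Look only at the top level of the directory, regular files only.
    found=$(find "$dir" -maxdepth 1 -type f -name "$pattern" | sort)

    if [ -z "$found" ]; then
        echo "no file matching '$pattern' in $dir" >&2
        return 1
    fi

    printf '%s\n' "$found"
}

# Usage (run as the tideway user):
#   list_upgrade_archives || echo "copy the upgrade archive first"
```

The function returns a non-zero status when the directory is missing or empty, so it can gate a wrapper script that would otherwise start the upgrade without an archive in place.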
To run the upgrade using the tw_run_upgrade utility, log in as the tideway user to the machine from which you will run the upgrade, and type the following command:
where options are any of the options described in tw_run_upgrade options and the common command line options described in Using command line utilities. You are prompted for the password corresponding to the user specified in the username option.
tw_run_upgrade options
The following table describes the tw_run_upgrade utility options that are not listed in Using command line utilities:
Resolving an incomplete upgrade process
The following section contains user examples of some typical scenarios where you will use the tw_run_upgrade utility on a standalone machine and in a cluster:
User example for a standalone machine
If the upgrade process is interrupted on a standalone machine, the command line directs you to run the upgrade process again by running the tw_run_upgrade --start command.
User examples for a cluster
Using the --restart command option: If the upgrade process is interrupted on one of the machines in the cluster after the services were stopped, that machine is put into a locked state, which prevents the other members of the cluster (where the upgrade has completed) from starting. If you attempt to start the services manually from the command line of the machine where the upgrade was interrupted, you are directed to unlock it and resume the upgrade process by running the following command:
sudo tw_run_upgrade --username=system --restart
Running the command unlocks that machine and resumes the upgrade process. Once the machine is upgraded and it reboots, all the machines in the cluster will also start.
Using the --fix-interrupted command option: If the upgrade process is interrupted on one of the machines in the cluster before the services were stopped, that machine is put into a locked state and the following error message is displayed on the machine from which you are running the upgrade:
Member no longer aware of ADDM Upgrade operation
The interruption stops the services only on the machine where the upgrade was interrupted. If you attempt to start the services on that machine manually from the command line, it directs you to unlock it by running the following command:
sudo tw_run_upgrade --username=system --fix-interrupted
Running the command unlocks that machine. The interactive command line tool informs you if any additional intervention is required before you can run the --start option and run the upgrade again for that machine.
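The three recovery cases in this section can be summarized as a small lookup. This is a sketch for reference only: the function name recovery_command and the scenario labels are hypothetical, and the function prints the command from the text rather than running it, because the utility itself tells you which option applies.

```shell
# Sketch only: map an interruption scenario to the command given in the text.
# recovery_command and the scenario labels are hypothetical names for this
# example; the printed commands are the ones quoted in this section.
recovery_command() {
    case "$1" in
        standalone)
            # Interrupted on a standalone machine: run the upgrade again.
            echo "tw_run_upgrade --start"
            ;;
        cluster-after-services-stopped)
            # Cluster member locked after services stopped: unlock and resume.
            echo "sudo tw_run_upgrade --username=system --restart"
            ;;
        cluster-before-services-stopped)
            # Cluster member locked before services stopped: unlock only.
            echo "sudo tw_run_upgrade --username=system --fix-interrupted"
            ;;
        *)
            echo "unknown scenario: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   recovery_command cluster-before-services-stopped
```

Printing instead of executing keeps the sketch safe to run anywhere. In the --fix-interrupted case, remember that a further --start run may still be needed, as the interactive tool reports.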
Post upgrade steps
Messages in the upgrade log
During the upgrade the firewall (iptables) is restarted. When a kernel upgrade is part of the upgrade, the firewall is unable to restart as there is a mismatch between the running kernel's version and the kernel on disk. The firewall logs a FATAL message, but as this is entirely expected, the upgrade wraps it in an information message:
2011-07-25 09:36:46: INFO: FATAL: Could not load /lib/modules/2.6.18-53.1.14.el5/modules.dep: No such file or directory
This is expected behavior and does not indicate a problem with the upgrade.
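When reviewing a long upgrade log, it can help to filter out these wrapped messages so that only other FATAL lines remain. The following is a minimal sketch under assumptions: the function name unexpected_fatal_lines is hypothetical, the log is read from standard input because its location is not given here, and the sketch assumes that a genuine failure would contain FATAL: without the preceding INFO: wrapper.

```shell
# Sketch only: print FATAL lines that are not wrapped in an INFO message.
# unexpected_fatal_lines is a hypothetical helper. It assumes (not confirmed
# by the text) that genuine failures appear as "FATAL:" without "INFO: ".
unexpected_fatal_lines() {
    # Reads the log on stdin. awk exits 0 even when nothing matches,
    # so an empty result does not look like an error to the caller.
    awk '/FATAL:/ && !/INFO: FATAL:/'
}

# Usage (replace the path with your actual upgrade log):
#   unexpected_fatal_lines < /path/to/upgrade.log
```

An empty result means every FATAL message in the log was the wrapped, expected kind described above.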
Recovering from a lost connection using screen
If you lose the connection to the appliance during the upgrade and you have used screen, you can reconnect to the appliance and recover the virtual terminal running the upgrade. To do this:
- Reconnect to the appliance and log in as the tideway user.
- List the current screen sessions. Enter:
[tideway@appliance01 ~]$ screen -ls
There is a screen on:
23274.pts-0.appliance01 (Detached)
1 Socket in /var/run/screen/S-tideway.
You can re-attach to it with a simple command:
[tideway@appliance01 ~]$ screen -r
The virtual terminal is recovered and you can see how the upgrade is progressing.
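If more than one screen session exists, a plain screen -r cannot choose between them, so it can be useful to pick out the detached session by name. The following is a minimal sketch: the function name detached_screen_sessions is hypothetical, and it assumes the screen -ls output format shown above, where each detached session line ends in (Detached).

```shell
# Sketch only: read `screen -ls` output on stdin and print the names of
# detached sessions, one per line. detached_screen_sessions is a
# hypothetical helper; it assumes the "(Detached)" marker shown in the text.
detached_screen_sessions() {
    # The session name (for example 23274.pts-0.appliance01) is the first
    # whitespace-separated field on each "(Detached)" line.
    awk '/\(Detached\)/ { print $1 }'
}

# Usage (as the tideway user): re-attach to the first detached session.
#   session=$(screen -ls | detached_screen_sessions | head -n 1)
#   [ -n "$session" ] && screen -r "$session"
```

Passing the session name to screen -r makes the re-attach explicit, which avoids ambiguity when several sessions are listed.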