Workflow: Server Restart

This triage, remediation, and validation utility workflow restarts or starts a server host system. A utility workflow is launched on demand from any event, except for a blackout or closed event. Because the utility workflow can be launched from any event – not a specific event that waits for an action to close its status – it requires a validation phase. The validation phase checks whether the action, server restart in this case, has occurred so that the associated task can be closed.

Note

A utility workflow addresses the stopping, starting, or restarting of a server or a service. It is not a specialized workflow and can be invoked from almost any event. Consequently, an incident is not created for a utility workflow.

If the server can be reached by a ping command from the BMC AO host system, the BMC AO host system launches an appropriate restart command for the operating system of the server. 

If the server cannot be reached from the BMC AO host system – that is, the ping attempt fails – then you must launch the Server Start workflow to send a network message command to start the server. For a description of the Server Start workflow, see Workflow: Server Start.

Configuration guidelines

To enable this workflow, you must configure the AutoPilot Credentials Store and the Server_Restart configuration module on the BMC Atrium Orchestrator server side.

Ensure that the credentials of the server host system or systems that are the subjects of the triage, remediation, and validation processes are added to the AutoPilot Credentials Store.

The Server_Restart configuration module also must contain the following definitions: 

Group/Item: Description
Restart group: Defines the restart FAT commands for the different operating systems. The out-of-the-box commands are listed below:

Windows_2003: shutdown -r
Oracle OS: reboot -r
HP_UX: reboot -r
Windows_2008: shutdown -r
Linux: reboot -f
AIX: reboot -f
Windows_2012: shutdown -r

AO_Host: BMC AO host where the Configuration Distribution Peer (CDP) server resides. The restart FAT commands are launched from the AO host system.
WF_Detailed_Logging_Flag: Enables detailed logging of this specific workflow in the Infrastructure Management Performance Manager Operations Console. The valid values are use default, true, and false. The use default value applies the true or false value specified in the Detailed Logging File item under the Runbook_Defaults configuration folder, which applies to all workflows. You can override this value by specifying its opposite value in the WF_Detailed_Logging_Flag item.
Validation_Pause_Count_Minutes: Time in minutes before the validation process begins to verify that the server has restarted or started. Because the restart and start processes take time, this delay is necessary before the validation process begins. The required time varies with the type of server, the operating system, network configuration, traffic, and so forth. The default value for the server restart workflow is 2 minutes.
Wake_On_Lan: Networking standard used to send a start signal to the server. The out-of-the-box command is dir $mac $ip. The underlying script contains the following syntax: wolcmd MACaddress IPaddress SubnetMask Port. The default SubnetMask value is 255.255.255.255. The default port value is 7. The Wake_On_Lan command is invoked by the Server Start workflow when the AO host system cannot ping the target server.
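
The Wake_On_Lan item relies on the Wake-on-LAN standard, in which a "magic packet" (6 bytes of 0xFF followed by the target MAC address repeated 16 times) is sent over the network. The following Python sketch shows roughly what such a sender does. It is not the wolcmd script that ships with the product. The function names build_magic_packet and send_wake_on_lan are hypothetical, and sending to the limited broadcast address 255.255.255.255 is an assumption made for illustration; it is not the same thing as the SubnetMask argument. The default port of 7 is taken from the description above.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return a standard Wake-on-LAN magic packet for the given MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Magic packet: 6 bytes of 0xFF followed by the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac: str, address: str = "255.255.255.255", port: int = 7) -> None:
    """Send the magic packet over UDP (the destination address is an assumption)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcast sends must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (address, port))
```

A call such as send_wake_on_lan("00:11:22:33:44:55") would send one packet; whether the target wakes depends on its firmware and network settings.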

On the BMC TrueSight Infrastructure Management Server side, you can find the action definitions for the Server Restart workflow in the installationDirectory\pw\server\etc\cellName\kb\bin\ao_actions.mrl file.


An extract from the ao_actions.mrl file depicts the action definition of this workflow for the manual, on-demand launch.

#Server Restart Utility Workflow
action 'Atrium Orchestrator Actions'.'Utility - Server Restart':
{
    ['Administrator', 'Full Access', 'Data Collection Administrator', 'Event Administrator', 'Event Operator', 'Data Collection Operator', 'Event Operator Supervisor', 'Data Collection Supervisor']
}
[
    'Create Change Request':MC_TRUEFALSE($CREATECHANGERQUEST),
    'Change Request Type':MC_CHANGEREQUESTTYPE($CHANGEREQUESTTYPE),
    'Server Name':STRING($HOST)
]
:EVENT($EV) where [ $EV.status != 'CLOSED' AND $EV.status != 'BLACKOUT' ]
{
    action_requestor($UID,$PWD);
    opadd($EV, "Triage and Remediate Server Restart", $UID);
    admin_execute(BEMGW,$EV,"Atrium_Orchestrator_Server_Restart_Workflow",[$CREATECHANGERQUEST,$CHANGEREQUESTTYPE,"false","true",$UID,$HOST,"restart"],YES);
}
END
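
The where clause in the extract limits the action to events whose status is neither CLOSED nor BLACKOUT. For readers less familiar with MRL, the same condition can be sketched in Python. The function name is_launchable and the dictionary representation of an event are assumptions made for illustration only.

```python
# Statuses for which the action definition does not offer the workflow.
EXCLUDED_STATUSES = {"CLOSED", "BLACKOUT"}


def is_launchable(event: dict) -> bool:
    """Mirror the MRL condition: status != 'CLOSED' AND status != 'BLACKOUT'."""
    return event.get("status") not in EXCLUDED_STATUSES
```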

Launching the workflow

You can launch this workflow only as a manual, on-demand workflow from the operator console, from any event except a closed or blackout event.

From the Events Console of the operator console, select the event, and choose the Tools > Remote Actions > Atrium Orchestrator Actions > Utility - Server Restart workflow entry. Then, fill in the Execute Actions dialog box. You can refer to the following table to determine which input value to select for the Server Restart workflow.

Input parameter: Description
Create Change Request: Boolean. True/false indicator that shows whether you want to create a change request in BMC Remedy Change Management System. If you choose false, the Change Request Type parameter is ignored.
Change Request Type: String. Specifies the type of change request (normal/preapproved).
Server Name: Fully qualified host name or IP address of the target server. If you do not specify a server name, then the mc_host value of the event is used to populate this field.
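
The defaulting rules in the table can be summarized in a short Python sketch. It only illustrates the described behavior: the RestartInputs class, the resolve_inputs function, and the dictionary representation of the event are hypothetical and are not part of the product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestartInputs:
    create_change_request: bool
    change_request_type: Optional[str]  # None when no change request is created
    server_name: str


def resolve_inputs(create_change_request: bool,
                   change_request_type: Optional[str],
                   server_name: Optional[str],
                   event: dict) -> RestartInputs:
    """Apply the defaulting rules described in the input parameter table."""
    # Change Request Type is ignored when Create Change Request is false.
    effective_type = change_request_type if create_change_request else None
    # An empty Server Name falls back to the mc_host value of the event.
    effective_server = (server_name or "").strip() or event["mc_host"]
    return RestartInputs(create_change_request, effective_type, effective_server)
```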

Common framework: event processing

You can launch the Utility - Server Restart workflow by selecting any event other than a blackout or closed event and then choosing the corresponding Atrium Orchestrator Action. The target server host is represented by the mc_host slot in the event mapping table that the workflow interprets.

After extracting the configuration data from the event, the common framework determines the logging level for the workflow, sets the level at normal or detailed, and updates the event information in the Notes dialog box of the operator console accordingly.
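
The logging level depends on the WF_Detailed_Logging_Flag item and, when that item is set to use default, on the value configured under Runbook_Defaults. A minimal Python sketch of that resolution follows. The function name resolve_detailed_logging is hypothetical, and the case-insensitive matching is an assumption.

```python
def resolve_detailed_logging(wf_flag: str, runbook_default: bool) -> bool:
    """Return True for detailed logging and False for normal logging."""
    value = wf_flag.strip().lower()
    if value == "use default":
        # Defer to the value configured under Runbook_Defaults.
        return runbook_default
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"unsupported WF_Detailed_Logging_Flag value: {wf_flag!r}")
```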

Triage processing

The Utility - Server Restart triage begins with a series of ping attempts launched from the BMC AO host system to the target server. Whether the ping is successful or not determines whether the Server Restart remediation process is launched.

If the host system can ping the server system, but the server process is down, then the Server Restart remediation process begins.

If the host system cannot ping the server system, then a start action may be called for. Return to the Events Console of the operator console, select the event, and choose the Tools > Remote Actions > Atrium Orchestrator Actions > Utility - Server Start workflow entry. (For more information, see Workflow: Server Start.)
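
The triage decision can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's implementation: the function names ping_once and triage are hypothetical, the number of attempts (3) is not specified in this topic, and treating a single successful ping as reachable is an assumption. The sketch shells out to the system ping command, whose exit code may not reflect reachability reliably on every platform.

```python
import platform
import subprocess
from typing import Callable


def ping_once(host: str, timeout_seconds: int = 2) -> bool:
    """Send one echo request using the system ping command."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(["ping", count_flag, "1", host],
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=timeout_seconds)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def triage(host: str, attempts: int = 3,
           ping: Callable[[str], bool] = ping_once) -> str:
    """Return 'restart' if any ping succeeds, otherwise 'start'."""
    for _ in range(attempts):
        if ping(host):
            return "restart"
    # Unreachable: the operator would launch Utility - Server Start instead.
    return "start"
```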

Remediation processing

During the remediation processing, the Server Restart workflow receives the alert from BMC ITSM. It extracts the task ID and task information from the alert. It begins the actual remediation by assigning the appropriate FAT command or commands, based on the operating system, to restart the server. When the remediation process concludes, the incident and the event notes are updated.
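
Selecting the command by operating system amounts to a lookup against the Restart group of the configuration module. The Python sketch below uses the out-of-the-box commands listed in the configuration guidelines. The function name restart_command_for and the use of the item names as lookup keys are assumptions made for illustration.

```python
# Out-of-the-box commands as listed for the Restart group.
RESTART_COMMANDS = {
    "Windows_2003": "shutdown -r",
    "Oracle OS": "reboot -r",
    "HP_UX": "reboot -r",
    "Windows_2008": "shutdown -r",
    "Linux": "reboot -f",
    "AIX": "reboot -f",
    "Windows_2012": "shutdown -r",
}


def restart_command_for(os_key: str) -> str:
    """Look up the configured restart command for an operating system key."""
    try:
        return RESTART_COMMANDS[os_key]
    except KeyError:
        raise ValueError(f"no restart command configured for {os_key!r}") from None
```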

Validation processing

If the remediation is successful, the workflow starts a validation process. The workflow again extracts the configuration data and updates the event notes. The workflow then pauses for the number of minutes assigned to the Validation_Pause_Count_Minutes item in the configuration module, which gives the server restart process time to complete.

After the pause count expires, the BMC AO host system again pings the target host. If the target host responds to the ping, then the workflow checks the uptime to see how long the server has been running from the time of the restart. The workflow then sets the validation status and updates the event information. After the validation is complete, the workflow closes the task and updates the event notes.
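
The validation logic can be sketched in Python as follows. The function name validate_restart and its parameters are hypothetical, and the ping and uptime checks are passed in as callables because this topic does not describe how the workflow obtains them. Comparing the reported uptime with the time elapsed since the restart command was issued is an assumption about how the uptime check is interpreted.

```python
import time
from typing import Callable


def validate_restart(host: str,
                     restart_issued_at: float,
                     pause_minutes: float,
                     ping: Callable[[str], bool],
                     uptime_seconds: Callable[[str], float],
                     sleep: Callable[[float], None] = time.sleep,
                     now: Callable[[], float] = time.time) -> bool:
    """Return True if the host answers a ping and booted after the restart was issued."""
    # Wait for Validation_Pause_Count_Minutes before checking anything.
    sleep(pause_minutes * 60)
    if not ping(host):
        return False
    # A host that has been up for less time than has elapsed since the
    # restart command must have rebooted in between.
    elapsed = now() - restart_issued_at
    return uptime_seconds(host) < elapsed
```

With the default pause of 2 minutes, a call would pass pause_minutes=2 together with the timestamp recorded when the restart command was sent.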
