Troubleshooting Virtual Guest Jobs
A Virtual Guest Job (VGJ) creates the virtual machine (VM) in the Virtual Center and enrolls it in BMC BladeLogic Server Automation.
VM creation includes both cloning and customizing the VM in the Virtual Center. The VGJ fails if an issue occurs during the cloning or customization phase.
You can find error information about failures in VGJs performed by BMC Cloud Lifecycle Management in the following locations:
- In Cloud Portal, access Advanced Error Information to find error details.
- In csm.log, search for the error reported in Advanced Error Information.
Verify that the error in csm.log is relevant to the failure of the service offering instance (SOI) in question by comparing the name of the SOI in csm.log or matching the time the error occurred.
- On the Virtual Center, access the specific virtual machine in question, and then access Tasks and Events.
- On the Virtual Center server, access bl-vmware.log. (Default Location: C:\Program Files\BMC Software\BladeLogic\8.1\RSCD\vmware\log\bl-vmware.log.) In BMC Server Automation 8.2 and later versions, there is no longer a bl-vmware.log file. All of the logs related to VMware are written to files named blcoserver_.log. (Default Location: C:\Program Files\BMC Software\BladeLogic\RSCD.) Check the latest logs based on time stamp.
- Virtual Guest Job logs from the BMC Server Automation Console: Select the VGJ from the CSM_Virtual_Guest_Jobs folder, right-click the job, and select Show Results.
- Application Server log: By default, the BMC Cloud Lifecycle Management Install Planner creates two Application Server instances named config_deployment and job_deployment. The VGJ logs are written to the job_deployment instance. For example: C:\Program Files\BMC Software\BladeLogic\8.1\Operations Manager\NSH\br\job_deployment_hostname.log.
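The csm.log check described in the list above (matching by SOI name or by the time of the failure) can be sketched as follows. This is a minimal illustration, not a BMC utility: the function names, the five-minute window, and the assumption that every csm.log line starts with a timestamp in the `23 Sep 2011 01:48:32,124` layout are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed timestamp layout, taken from sample lines such as
# "23 Sep 2011 01:48:32,124 [INFO] API - ..."
TS_RE = re.compile(r"^(\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3})")
TS_FORMAT = "%d %b %Y %H:%M:%S,%f"


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    match = TS_RE.match(line)
    if match is None:
        return None
    return datetime.strptime(match.group(1), TS_FORMAT)


def find_relevant_lines(lines, soi_name, failure_time, window_seconds=300):
    """Keep lines that mention the SOI name or fall close to the failure time."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        if soi_name in line:
            hits.append(line)
            continue
        stamp = parse_timestamp(line)
        if stamp is not None and abs(stamp - failure_time) <= window:
            hits.append(line)
    return hits
```

To use it, read csm.log into a list of lines and pass the SOI name and the failure time shown in Advanced Error Information.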
Always select the Customize OS check box in the VM Config Type Settings tab of the virtual guest package (VGP).
Virtual Guest Job Execution
When the VGJ starts in BMC Server Automation, BMC Cloud Lifecycle Management polls the status every minute.
23 Sep 2011 01:48:32,124 [INFO] API - [Thread=59536c00-4318-40ad-be92-2141eb864653::2232a3f7-e507-4466-b48a-37db2f1d1d0c(135)] [Class=BaseBBSATaskExecHelper:getJobStatusURI] - Job status found for Windows-4_VG_Job_1316722646899. Getting status URI.
23 Sep 2011 01:48:32,124 [INFO] API - [Thread=59536c00-4318-40ad-be92-2141eb864653::2232a3f7-e507-4466-b48a-37db2f1d1d0c(135)] [Class=BaseBBSATaskExecHelper:pollForCompletion] - Retrieving BBSA job status. Poll count: 1. Status URI = /id/SystemObject/Job/Virtual Guest Job/b8dabe73-8f85-4e9a-aae8-3d4bb64cac46/Statuses/Status365
23 Sep 2011 01:48:32,602 [INFO] API - [Thread=59536c00-4318-40ad-be92-2141eb864653::2232a3f7-e507-4466-b48a-37db2f1d1d0c(135)] [Class=BaseBBSATaskExecHelper:pollForCompletion] - BBSA Job is not complete. Waiting to poll for BBSA job status for /id/SystemObject/Job/Virtual Guest Job/b8dabe73-8f85-4e9a-aae8-3d4bb64cac46/Statuses/Status365
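When following a long-running job, it can help to pull the poll count out of entries like those above. The sketch below assumes each entry sits on its own line and follows the layout of the sample (timestamp, level, thread, `[Class=Name:method]`, message). The regular expressions and the `parse_poll_line` helper are illustrative, not part of any BMC tooling.

```python
import re

# Pattern inferred from the sample entries only; other csm.log entries may differ.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\w+)\] .*?"
    r"\[Class=(?P<cls>[^:\]]+):(?P<method>[^\]]+)\] - (?P<message>.*)$"
)
POLL_COUNT_RE = re.compile(r"Poll count: (\d+)")


def parse_poll_line(line):
    """Split one log entry into fields; poll_count is None if not present."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    count = POLL_COUNT_RE.search(record["message"])
    record["poll_count"] = int(count.group(1)) if count else None
    return record
```

A poll count that approaches the configured maximum suggests that the job is close to timing out on the BMC Cloud Lifecycle Management side.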
The status poll is controlled by these properties in the providers.json file:
The number of times BMC Cloud Lifecycle Management checks to determine if a BMC Server Automation VGJ is complete before timing out. The default value is 60. It can be increased if necessary.
The number of seconds between polls for the VGJ status. The default value is 60. It can be increased if necessary.
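The two settings multiply to give the total time BMC Cloud Lifecycle Management waits for a VGJ. A small worked calculation, with hypothetical parameter names standing in for the two properties described above:

```python
def polling_window_seconds(poll_count=60, poll_interval_seconds=60):
    """Total time spent polling before the job is treated as timed out."""
    return poll_count * poll_interval_seconds


# With the defaults described above: 60 polls x 60 seconds = 3600 seconds.
default_window_minutes = polling_window_seconds() / 60
```

With the default values, the window is 3600 seconds, which matches the 60-minute default VGJ timeout.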
When a VGJ fails in BMC Server Automation, the job status polling is completed and csm.log captures the error reported in BMC Server Automation logs.
Increasing the VGJ timeout
When you request multiple VMs in an SOI, the Virtual Center might require more time to complete the cloning, depending on the performance of the Virtual Center. The additional time might cause the VGJ to time out and fail. The default timeout is 60 minutes.
By default, the property Dclone-timeout=3600000 (one hour, expressed in milliseconds) is set in the vmware.properties file on the VC Server.
BMC Server Automation versions earlier than 8.2 — VC Server\\C:\Program Files\BMC Software\BladeLogic\8.1\RSCD\vmware\vmware.properties
BMC Server Automation version 8.2 and later — RSCD_HOME\daal\Implementation\BMC_VMware_VirtualInfrastructureManager_win64\win64\vmware.properties
Increase the value of the property to the time required, such as two hours. In that case, the value of the property would be Dclone-timeout=7200000 (milliseconds). Restart the Virtual Center agent after modifying the value.
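The conversion from hours to the millisecond value can be sketched as below. The helper names are hypothetical, and the code assumes that the property appears on its own line exactly as `Dclone-timeout=<milliseconds>`. Back up vmware.properties before changing it.

```python
def hours_to_millis(hours):
    """Convert a timeout in hours to milliseconds."""
    return int(hours * 60 * 60 * 1000)


def set_clone_timeout(properties_text, hours, key="Dclone-timeout"):
    """Return the properties text with the clone timeout set to the given hours."""
    new_line = f"{key}={hours_to_millis(hours)}"
    updated, replaced = [], False
    for line in properties_text.splitlines():
        if line.strip().startswith(key + "="):
            updated.append(new_line)  # replace the existing setting
            replaced = True
        else:
            updated.append(line)      # keep every other line untouched
    if not replaced:
        updated.append(new_line)      # property was missing: append it
    return "\n".join(updated)
```

For example, `set_clone_timeout(text, 2)` produces a line that reads `Dclone-timeout=7200000`.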
Ensure that the relevant properties are also modified in BMC Cloud Lifecycle Management. Set the following properties to the same amount of time:
Modifying the providers.json file requires a restart of the CSM service.
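To keep the polling settings in step with the clone timeout, the required number of polls can be derived from the millisecond value. A minimal sketch, in which the function and parameter names are illustrative and do not correspond to property names in providers.json:

```python
import math


def required_poll_count(clone_timeout_ms, poll_interval_seconds=60):
    """Smallest number of polls whose total duration covers the clone timeout."""
    return math.ceil(clone_timeout_ms / (poll_interval_seconds * 1000))
```

With a two-hour clone timeout (7200000 milliseconds) and the default 60-second interval, this gives 120 polls.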
Use the "bottom up" approach when investigating an SOI request that fails because of issues in the VGJ. Investigate issues at the Virtual Center first, then in BMC Server Automation, and finally in BMC Cloud Lifecycle Management.
First, try to clone and customize a VM in the Virtual Center directly. This can identify customization issues with a specific operating system or template. If cloning and customization work in the Virtual Center, try to execute the VGJ directly from BMC Server Automation to verify the results.
Disable SOI Rollback on failure
When there is a failure while running any BMC Cloud Lifecycle Management jobs in BMC Server Automation, BMC Cloud Lifecycle Management triggers the de-provision workflow to decommission the VM and free up its associated resources.
In certain situations, you may need to retain the target VM in order to troubleshoot a failure.
To disable the SOI rollback, modify the providers.json file:
Restart the CSM service for the settings to take effect.
This solution should be used only for troubleshooting VM failure scenarios. Revert the settings to the original values after troubleshooting is complete.
Virtual Guest Job errors caused by problematic templates
The following error indicates an issue with the template:
"Error returned from plug-in ; Plug-in: /BMC_VMware_VirtualInfrastructureManager_win64 ; Plug-in function: blAsset_PutAll ; Plug-in asset: BMC_VMware_VirtualMachineTemplate:RUH-002-VCC-004:/Templates/Windows 2008 R2 IIS_Template ; Plug-in error code: 100 ; Plug-in error message: Internal error occurred. Index: 1, Size: 1 Please refer agent log for additional details"
Most VGJ issues are specific to the template being used. Try a new template to quickly isolate the issue.
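Plug-in errors like the one above are easier to compare across failures when split into fields. The sketch below assumes that fields are separated by semicolons and that each field has the form `name: value`. The `parse_plugin_error` helper is illustrative only.

```python
def parse_plugin_error(message):
    """Split a plug-in error string into a dict of field name to value."""
    fields = {}
    for part in message.strip().strip('"').split(";"):
        # Split on the first colon only, so values may contain colons.
        key, sep, value = part.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields
```

The "Plug-in asset" field then identifies the template involved, which is the first candidate to replace when isolating the issue.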
VM Deployment to VirtualHost targets
When deploying VirtualMachines to VirtualHost targets, verify the associated VirtualCluster has the DRS option turned off. VM deployment to VirtualHosts is not supported on DRS-enabled VirtualClusters.
The following error is generated while provisioning:
/CLM-Linux-APP; Plug-in error code: 100; Plug-in error message: Internal error occurred.[Error, Target host cannot used for VM deploy as it is in DRS cluster]
As a workaround, you can target the cluster instead of deploying VMs on a specific host. In other words, the compute pools created in BMC Cloud Lifecycle Management should be based on clusters instead of ESX hosts.