Issues related to cluster server monitoring

This topic contains troubleshooting information about monitoring of cluster servers.

Each issue below is followed by its resolution.

Shares are not getting monitored

Shares are disabled by default. To enable Shares, include them in the Infrastructure Policy.

Cluster containers are not getting created

Check the KM Configuration Status (ConfigStatus) attribute in the Cluster Monitoring (MCS_Remote) monitor types.

An annotation message is displayed with the detailed description of the issue. If the issue still persists, contact BMC Support. 

Collect logs

Use the Debug menu command to enable PSL trace.

Set the pconfig variable /MCS_Remote/Clusters/{clustername}/logginglevel to INFO or NOLOGGING.

By default, SEVERE logging is enabled.

Collect %patrol_home%\mcs\log\MCSCluster_PID.log
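
The following is a minimal sketch of how the log files could be located for collection. It assumes that PID in the file name stands for a process ID, so the files match the pattern MCSCluster_*.log, and that you pass in the PATROL home directory yourself. The function name find_cluster_logs is hypothetical and is not part of the product.

```python
from pathlib import Path


def find_cluster_logs(patrol_home):
    """Return MCSCluster_*.log files under <patrol_home>/mcs/log, newest first.

    Assumption: "PID" in MCSCluster_PID.log is a placeholder for a process ID,
    so any file matching MCSCluster_*.log is a candidate.
    """
    log_dir = Path(patrol_home) / "mcs" / "log"
    if not log_dir.is_dir():
        return []
    # Sort by modification time so the most recent log comes first.
    return sorted(
        log_dir.glob("MCSCluster_*.log"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
```

For example, find_cluster_logs(r"C:\PATROL") returns the matching files, or an empty list if the log directory does not exist.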

Cluster is not getting discovered

Check the KM Configuration Status (ConfigStatus) attribute in the Cluster Monitoring (MCS_Remote) monitor types.

An annotation message is displayed with the detailed description of the issue. If the issue still persists, contact BMC Support.

PATROL McsMonitor (McsService.exe) not listed in the Control Panel Services Applet

One of the following is true:

  • The Cluster Administrator is installed on your cluster-level agent
  • The clusapi.dll file is missing
  • You have not entered a valid KM account

MCS_Clusters application only contains two parameters

No popup to select the clusters to monitor

The MCS_Clusters discovery process executes the cluster /list command to create the cluster list.
However, this command might not return a list of clusters if the cluster.exe command is missing or is not in the PATH.
To verify the problem:

  1. Turn on the application class debug for MCS_Cluster.
  2. Run this command from the command line: cluster /list
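
Before running the command, you can confirm whether cluster.exe is reachable through the PATH. The sketch below uses Python's standard shutil.which for that check. The function name find_cluster_command is hypothetical, and the script is only an illustration, not a tool shipped with the KM.

```python
import shutil


def find_cluster_command(command="cluster", search_path=None):
    """Return the full path of the command if it is found, else None.

    When search_path is None, the PATH environment variable is searched.
    On Windows, shutil.which also tries the PATHEXT extensions, so
    "cluster" resolves to cluster.exe when it is present.
    """
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    location = find_cluster_command()
    print(location or "cluster.exe was not found in the PATH")
```

If the script reports that the command was not found, the discovery process is likely unable to build the cluster list for the same reason.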

Check configuration

From the MCS_Clusters application class, double-click the McsCheckConfiguration parameter. Your configuration information, such as cluster connection account and port number, is displayed in a text window.

MCS_Cluster instances are not discovered

Turn on the application class debug for MCS_Cluster, and then check the debug output:

  • If the output shows GetLastError is 5, the user does not have connect permission to the cluster. Grant this user the "Full Control" permission on the cluster, and restart the PATROL Agent.
  • If the output shows Failed in Function GetClusterInformation <87>, GetLastError=183, the version of clusapi.dll is not valid. Make sure that the current system's Service Pack was reapplied after the Cluster Administrator was installed.
  • If the output shows <Failed in opening cluster <ClusterName> GetLastError=1722>>, the RPC server is unavailable. Make sure that the cluster service has been restarted on the cluster nodes and that the cluster network name resource is online.
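
The three GetLastError values described above can be looked up mechanically when you review a long debug output. The following sketch maps each code to the diagnosis given in this topic. The exact layout of a debug line is an assumption based on the fragments quoted here, and the names DIAGNOSES and diagnose are hypothetical.

```python
import re

# Diagnoses are taken from this topic, keyed by the GetLastError code.
DIAGNOSES = {
    5: "User has no connect permission to the cluster; grant Full Control "
       "and restart the PATROL Agent.",
    183: "clusapi.dll version is not valid; reapply the Service Pack after "
         "installing the Cluster Administrator.",
    1722: "RPC server is unavailable; check the cluster service and the "
          "cluster network name resource.",
}

# Assumption: the code follows "GetLastError" as "=5", "=<5>", or " is 5".
_ERROR_RE = re.compile(r"GetLastError\s*(?:=|is)\s*<?(\d+)")


def diagnose(line):
    """Return the diagnosis for a debug line, or None if no known code is found."""
    match = _ERROR_RE.search(line)
    if match is None:
        return None
    return DIAGNOSES.get(int(match.group(1)))
```

For example, passing the line `<Failed in opening cluster <ClusterName> GetLastError=<1722>.>` to diagnose returns the RPC server entry.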

Create a trace session for mcsservice.exe and mcsgateway.exe

To start the PATROL MCS Monitor in trace mode:

  1. From the Control Panel, open the Services screen.
  2. Select PATROL MCS Monitor and click Stop.
  3. In the Startup Parameters field, enter -debug and click Start.
    The mcsservice.exe and mcsgateway.exe processes are now in trace mode. If the PATROL_TEMP environment variable is set, they write trace files to PATROL_TEMP\mcs. If PATROL_TEMP is not set, they write trace files to PATROL_HOME\tmp\mcs. The file names are mcsservice.dbg and mcsgateway.dbg.
    A trace session must run until the problem occurs. To stop the trace session, stop the PATROL MCS Monitor.
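
The rule for where the trace files are written can be expressed as a short lookup. The sketch below assumes that PATROL_HOME is also available as an environment variable, which this topic does not state. The function names trace_directory and trace_files are hypothetical. Paths are joined with Windows separators through the standard ntpath module, so the sketch gives the same result on any platform.

```python
import ntpath

TRACE_FILE_NAMES = ("mcsservice.dbg", "mcsgateway.dbg")


def trace_directory(env):
    """Return the trace directory for the given environment mapping, or None.

    PATROL_TEMP\\mcs is used when PATROL_TEMP is set; otherwise
    PATROL_HOME\\tmp\\mcs is used (assumption: PATROL_HOME is in env).
    """
    patrol_temp = env.get("PATROL_TEMP")
    if patrol_temp:
        return ntpath.join(patrol_temp, "mcs")
    patrol_home = env.get("PATROL_HOME")
    if patrol_home:
        return ntpath.join(patrol_home, "tmp", "mcs")
    return None


def trace_files(env):
    """Return the expected full paths of both trace files."""
    directory = trace_directory(env)
    if directory is None:
        return []
    return [ntpath.join(directory, name) for name in TRACE_FILE_NAMES]
```

On the monitored host, calling trace_files(os.environ) after importing os lists the two paths to collect once the trace session has been stopped.
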

The McsGwConAvailable parameter is continuously in ALARM (is McsGateway operational)

The McsGwConAvailable parameter enters a continuous alarm state if you shut down and restart or reinitialize the PATROL Agent while McsService (PATROL MCS Monitor) is running.
To prevent this continuous alarm condition, stop and restart McsService (PATROL MCS Monitor) whenever you shut down and restart or reinitialize the PATROL Agent.

Parameter values are -1

If parameters generate a -1 value, check the PATROL Event Manager for an error message. The user account defined when loading or configuring the KM might not have connect permission to the PATROL Agent on the cluster node. Make sure that this user account has been granted the "log on locally" right on the cluster node.

The ClusterStatus parameter does generate a value of –1.

One of the following error messages is displayed:

  • Device not found
  • Device inaccessible

If there is a failover between cluster nodes, the nodes cannot access shared media via the drive letter. You receive an error message stating Device not found or Device inaccessible. However, you can still access programs via the program name (for example, MS Exchange). This error is caused by a known Microsoft problem, tracking number SRX001026603785.
If this situation occurs during installation, the KM is unable to install PATROL Uptime as a cluster resource, and the following parameters do not function: ClusterUptimeColl, Cluster Uptime, and Cluster Availability.
To continue using the PATROL Uptime cluster resource (that is, to be able to connect to it in Explorer), you must make the resource available to the cluster. To make the resource available to the cluster:

  1. Reboot all the nodes in the cluster.
  2. From a command prompt, change to the patrol_home\bin directory.
  3. Type McsUpTinstall -Install and press Enter.
    This procedure creates a cluster resource named PATROL Uptime which is used for collection of uptime data.

The following error message is displayed in the PATROL Console System Output window

MCS_Cluster, Line# 41: PSL: Error 41 executing XPC[#####]

Turn on trace for the MCS_Cluster application class. You should see the following error message in the trace window:
<Failed in opening cluster <ClusterName> GetLastError=<1722>.>
This error message is related to the operating system and indicates that the RPC server is unavailable.
