General issues


This section describes how to troubleshoot general issues with the PATROL for Microsoft Windows Servers KM.

Issue

Resolution

Gather diagnostic information

To help BMC Support resolve the issue faster, gather the following information:

  • Version number of the KM.
  • Installation logs from the %USERPROFILE%\Application Data\BMCINSTALL directory. The name of the log file is a combination of the computer name and a time stamp. For example: C:\WINNT\Profiles\bhunter\Application Data\BMCinstall\BHUNT_1-1005340189.log
  • Other log files:
    • Microsoft Windows Servers local XPC - The log file is located at the %patrol_home%\log\psx_server_<PID>.log location.
    • Microsoft Windows Servers remote XPC - The log file is located at the %patrol_home%\log\psx_server_remote_<PID>.log location.
    • PATROL Event Manager - From the PATROL console, right-click the host and select Event Manager. The PATROL Event Manager shows all of the PATROL related events for the host. You can check here to determine if NOTIFY_EVENTS are being generated.
    • PATROL Diags - From the PATROL console, load KM PSX_APPLICATION_DEBUG and right-click Application Trace icon > KM Commands > Create Diagnostic Report.
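
As a rough aid for collecting the XPC logs listed above, the following Python sketch builds the two log file paths for a given process ID. The function name xpc_log_paths and the sample PATROL_HOME and PID values are illustrative assumptions; only the file name patterns come from the list above.

```python
from pathlib import PureWindowsPath


def xpc_log_paths(patrol_home: str, pid: int) -> dict:
    """Return the expected local and remote XPC log paths for one PID."""
    log_dir = PureWindowsPath(patrol_home) / "log"
    return {
        "local": str(log_dir / f"psx_server_{pid}.log"),
        "remote": str(log_dir / f"psx_server_remote_{pid}.log"),
    }


if __name__ == "__main__":
    # Hypothetical PATROL_HOME and PID, for illustration only.
    for kind, path in xpc_log_paths(r"C:\Patrol3", 1660).items():
        print(kind, path)
```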

Windows services configuration issues, including services that are not being monitored

By default, the PATROL Agent monitors the availability of all system services except those whose startup type is Disabled. It generates alerts for all the automatic services when they are stopped.

  1. Check the status of the service that did not generate an alert.
  2. If services are not being monitored, confirm the KM version.
  3. If you are running PATROL for Microsoft Windows Servers version 4.9, which does not monitor all the automatic services by default, upgrade to the latest version of the KM.
  4. Confirm that the PATROL Agent user account has all the required access.
  5. For more information about configuring services using regular expressions, see Support for regular expressions.

  6. Make sure that the services are configured and are accessible from the PATROL Agent server.

Verifying access to services from PATROL Agent: 

  • Go to Start > Run > services.msc.
  • Right-click Services (Local).
  • Click Connect to another computer.
  • Click Browse, enter the hostname of another computer, and click Connect.
  • Confirm that you can see the required services in the Services console on the PATROL Agent computer.
  • If the PATROL Agent does not have access to the services on the remote server, contact your system administrator.

Windows Process monitoring

By default, PATROL Agent does not monitor any processes. The following options are available to configure process monitoring:

  • Manual - Select the process to be monitored by the PATROL Agent and customize how the PATROL Agent monitors it.
  • Automatic - Monitor a process only if it exceeds a CPU utilization percentage.

To monitor processes, the PATROL Agent must have access to the following registry key and all of its subkeys: HKLM\SOFTWARE\Microsoft\WindowsNT\perflib. Use regular expressions to configure a process name. For more information, see Adding processes for monitoring. If process monitoring does not work as expected, check the psx_server.log file in the Patrol3 folder for process-related errors.

CPU or memory consumption by psx_server.xpc is high

The psx_server.xpc process is required to monitor Windows event logs and processes.

  1. Update the PATROL Agent to 10.0 or later and the Windows KM to 5.0 or later.
  2. Check the psx_server.xpc log for any issues related to corrupt performance counters.

    Example:
    [20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- [CPerfCounterDatabase::GetPerfIndex] Object Process not found
    [20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- [CPerfDatasetImplEx::_Initialize] Object Process is not a valid performance object
    [20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- [CPerflibModule::CreatePerformanceDataset] Unable to open performance query for object 'Process', counter 'ID Process': Error = 1168
    [20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- [psx_perf::OpenPerformanceDataset] Unable to open performance dataset for object 'Process', counter 'ID Process': Error = 1168
    [20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- [pwlrdsdata.Initialize] Failed to OpenPerformanceDataset()
    [20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- [CXpclib::_LoadExtension] Failed to initialize extension: C:\PROGRA~1\BMCSOF~1\Patrol\Patrol3\lib\psl\pwl_rdsdata.dll. Error = 1168
  3. Make sure that the performance counters for the Process object are available on the server.
  4. Open Performance Monitor and confirm that the Process counters are listed.
  5. If the psx_server.xpc log contains errors related to performance counters, run lodctr /R to rebuild the performance counters.
  6. Check whether the process counters are disabled by inspecting the following registry value:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\PerfProc\Performance /v "Disable Performance Counters" /t REG_DWORD /d 0
    Make sure that the Disable Performance Counters value is set to 0.
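
When the log is long, a small script can help spot the counter-related entries. The following Python sketch parses lines in the format shown in the example above and returns the ERROR entries that mention performance objects or counters. The regular expression is inferred from the sample log lines only, and the keyword match on "performance" is an assumption, not a documented rule.

```python
import re

# Pattern inferred from the sample log lines; other builds may log differently.
LOG_LINE = re.compile(
    r"\[(?P<timestamp>[\d.]+)\]\s+\(PID=\s*(?P<pid>\d+)\)\s+"
    r"\(TID=\s*(?P<tid>[0-9a-fA-F]+)\)\s+--(?P<level>\w+)--\s+"
    r"\[(?P<source>[^\]]+)\]\s+(?P<message>.*)"
)


def counter_errors(lines):
    """Return parsed ERROR entries whose message mentions 'performance'."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match["level"] != "ERROR":
            continue
        if "performance" in match["message"].lower():
            hits.append(match.groupdict())
    return hits


if __name__ == "__main__":
    sample = [
        "[20200124150058.788] (PID= 1660) (TID= 1e6c) --ERROR-- "
        "[CPerfDatasetImplEx::_Initialize] Object Process is not a valid performance object",
    ]
    for entry in counter_errors(sample):
        print(entry["timestamp"], entry["source"], entry["message"])
```

If the function returns any entries, continue with the lodctr /R step described above.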

Windows Event log configuration and monitoring

You can use only a domain account for remote event log monitoring. The Windows KM monitors only the event logs that are registered under the following registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog
  1. Determine whether only a specific event ID is not monitored or no events are monitored at all.
  2. Use the EVENTCREATE command on the Windows Server to create a test event, and monitor it to confirm that event log monitoring works.
  3. Make sure that the PATROL Agent has access to the following registry key, which is required to configure event log monitoring:
    HKLM\SYSTEM\CurrentControlSet\Services\Eventlog\
  4. To configure Windows event log, see Configuring windows event log.
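
The test event mentioned in the steps above can be scripted. The following Python sketch only builds the argument list for the Windows eventcreate command; the default source name PatrolTest, the event ID, and the description are hypothetical values chosen for illustration. On a Windows server, the returned list could be passed to subprocess.run.

```python
def eventcreate_args(event_id, description, log="APPLICATION",
                     event_type="ERROR", source="PatrolTest"):
    """Build the argument list for a test event (source name is hypothetical)."""
    # eventcreate only accepts event IDs in the range 1 to 1000.
    if not 1 <= event_id <= 1000:
        raise ValueError("eventcreate accepts event IDs from 1 to 1000")
    if event_type not in {"SUCCESS", "ERROR", "WARNING", "INFORMATION"}:
        raise ValueError(f"unsupported event type: {event_type}")
    return [
        "eventcreate",
        "/T", event_type,
        "/ID", str(event_id),
        "/L", log,
        "/SO", source,
        "/D", description,
    ]


if __name__ == "__main__":
    print(eventcreate_args(999, "PATROL event log monitoring test"))
```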

Windows KM is not collecting data

This issue can have several causes. If the Windows KM is not collecting data, verify the following:

  1. Check the error logs for any error message.
  2. Check if the PATROL Agent account has required permission (seven access rights).
  3. To fix performance monitor corruption, run lodctr /R.
  4. Determine whether all application classes of the Windows KM are failing to collect data or only particular classes.
  5. Make sure that the KM or its specific application classes are not in the list of disabled KMs.
  6. Check the KM version and confirm that there are no known issues for that version.
  7. In Task Manager, check whether multiple instances of patrlPerf.exe are running.
  8. End the unwanted instances of patrlPerf.exe.
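
As an alternative to checking Task Manager by eye, the instance check in the steps above can be sketched in Python. The function below assumes that its input is the text produced by tasklist /FO CSV /NH, where the first column is the image name and the second is the PID; the sample rows are made up for illustration.

```python
import csv
import io


def find_instances(tasklist_csv, image_name="patrlPerf.exe"):
    """Return the PIDs of all rows whose image name matches (case-insensitive)."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return [
        row[1]
        for row in reader
        if len(row) > 1 and row[0].lower() == image_name.lower()
    ]


if __name__ == "__main__":
    # Hypothetical output of: tasklist /FO CSV /NH
    sample = (
        '"patrlPerf.exe","4120","Services","0","12,340 K"\n'
        '"explorer.exe","2200","Console","1","80,112 K"\n'
        '"patrlPerf.exe","5316","Services","0","11,904 K"\n'
    )
    pids = find_instances(sample)
    if len(pids) > 1:
        print("Multiple instances found:", ", ".join(pids))
```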

Remote monitoring issues

  1. WinRM protocol must be enabled on both PATROL Agent and remote servers.
  2. On the PATROL Agent and remote servers, run the winrm qc command to confirm that WinRM is configured and a listener is created.
  3. Confirm the authentication mechanism used for data collection.
  4. For NTLM authentication, confirm that the following registry value exists and is set to 1:
    HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\system\LocalAccountTokenFilterPolicy
  5. The account must be a member of either the domain or the local administrators group.
  6. Use the domain account for event log monitoring.
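
The registry check in the steps above can be sketched as follows. This Python function assumes that its input is the text printed by a reg query command for the LocalAccountTokenFilterPolicy value, in the usual layout of name, type, and hexadecimal data on one line; the function name and the sample text are illustrative.

```python
import re

# Matches a line such as: LocalAccountTokenFilterPolicy    REG_DWORD    0x1
VALUE_PATTERN = re.compile(
    r"LocalAccountTokenFilterPolicy\s+REG_DWORD\s+(0x[0-9a-fA-F]+)"
)


def token_filter_policy_enabled(reg_query_output: str) -> bool:
    """Return True only if the value is present and set to 1."""
    match = VALUE_PATTERN.search(reg_query_output)
    return bool(match) and int(match.group(1), 16) == 1


if __name__ == "__main__":
    sample = (
        "HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Policies\\system\n"
        "    LocalAccountTokenFilterPolicy    REG_DWORD    0x1\n"
    )
    print(token_filter_policy_enabled(sample))
```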

Debugging options

  1. On the Add Monitoring Configuration window, from the Monitoring Solution list, select Microsoft Windows Servers.
  2. From the Monitor Profile list, select KM Debug.
    Debug is displayed in the Monitor Type list.
  3. From the list of operating systems (monitor types), select the check boxes for the monitor types for which you want to enable debug logs.
    Logs are collected in the following folder or directory - $PATROL_HOME/PSX/log.
  4. In the Remote host names field, enter a comma-separated list of remote hosts for which you want to enable logs.
    You can also enter a regular expression to identify hosts.
  5. To enable debug logs for clusters, in the Cluster server section, select the Cluster Server check box.
    Logs are collected in the following folder or directory - $PATROL_HOME/mcs/log.
  6. In the Cluster names field, enter a comma-separated list of clusters for which you want to enable logs.
    You can also enter a regular expression to identify clusters.
  7. Click OK and Close.

Unsupported characters

PATROL does not support the following characters and symbols:  

  • Non-English ASCII characters
  • Comma ( , ). PATROL KM for Cluster Server does not support the comma in object names.
  • Colon ( : ). PATROL uses the colon as an identifier for a specific icon. Creating a PATROL instance name with a colon causes PATROL to use the icon identified by the colon.
  • Slash ( / ). A PATROL instance name cannot contain a slash; however, PATROL KM for Cluster Server replaces a slash with a #.
  • Blank. A PATROL instance name cannot contain a blank; however, PATROL KM for Cluster Server removes the blank before calling a PSL function.
  • Backslash ( \ ) is supported only in conjunction with other special characters (see the PSL Reference Manual); however, PATROL supports object names masked with \\.

While PATROL is foreign language tolerant, domains, groups, users, computers, resources, and similar objects with names that contain non-English characters or the symbols listed above might not be properly discovered by PATROL. Replacing non-English characters in object names with English characters and removing the unsupported symbols from object names allows PATROL to discover objects properly. You might need to use software other than PATROL to change object names.
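
To find object names that are likely to cause discovery problems, a check along the following lines can be run against a list of names. This Python sketch covers the comma, colon, slash, and blank from the list above and treats any character outside the ASCII range as non-English, which is a simplifying assumption. The backslash is left out because it is supported in some combinations. The function name and the sample names are hypothetical.

```python
UNSUPPORTED_SYMBOLS = {",": "comma", ":": "colon", "/": "slash", " ": "blank"}


def unsupported_characters(name):
    """List the problems found in an object name, in order of appearance."""
    problems = []
    for ch in name:
        if ch in UNSUPPORTED_SYMBOLS:
            problems.append(UNSUPPORTED_SYMBOLS[ch])
        elif ord(ch) > 127:
            # Assumption: any non-ASCII character counts as non-English.
            problems.append(f"non-ASCII character {ch!r}")
    return problems


if __name__ == "__main__":
    for candidate in ["FILESRV01", "SQL Cluster:1"]:
        print(candidate, unsupported_characters(candidate))
```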

Track system resources health

A baseline is a set of data that indicates how system resources are being used. You can then compare this data with later activity to help determine system usage and system response to that usage.

To create a standard Windows baseline, monitor the four major server resources:

  • memory
  • processor
  • disk
  • network objects

Using PATROL, you should consider monitoring the following parameters:

  • NT_CACHE - CACcachCopyReadHitsPercent and CACcachCopyReadsPerSec
  • NT_CPU - CPUprcrInterruptsPerSec, CPUprcrPrivTimePercent, CPUprcrProcessorTimePercent, and CPUprcrUserTimePercent
  • NT_IPX - IPXipxBytesTotalPerSec and IPXipxPktsPerSec
  • NT_LOGICAL_DISKS - LDldDiskQueueLength, LDldDiskTimePercent, LDldFreeMegabytes, and LDldFreeSpacePercent
  • NT_MEMORY - MEMmemAvailableBytes, MEMmemCacheFaultsPerSec, MEMmemPageFaultsPerSec, MEMmemPagesInputPerSec, and MEMmemPagesPerSec
  • NT_NETBEUI - BEUbeuBytesTotalPerSec and BEUbeuPktsPerSec
  • NT_NETBIOS - BIObioBytesTotalPerSec and BIObioPktsPerSec
  • NT_NETWORK - NETniBytesTotalPerSec, NETniOutputQueueLength, and NETniPcktsPerSec
  • NT_PAGEFILE - PAGEpgUsagePercent
  • NT_PHYSICAL_DISKS - PDpdDiskQueueLength and PDpdDiskTimePercent
  • NT_PROCESS - PROCProcessorTimePercent
  • NT_SECURITY - SECsvrErrorsLogon
  • NT_SERVER - SVRsvrBytesTotalPerSec
  • NT_SYSTEM - SYSsysContextSwitchesPerSec and SYSsysProcessorQueueLength

Capture and save the history data from these parameters to find out what is normal for your system. You can use this data to find performance bottlenecks and to analyze future trends for capacity planning. The PATROL History Loader product can help you manage PATROL history data and port it to the database management system of your choice. For detailed information on baselining, see http://www.microsoft.com.
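
As a simple illustration of how a baseline can be used, the following Python sketch computes the mean and standard deviation of saved history values for one parameter and flags a later value that falls far outside that range. The sample values and the threshold of three standard deviations are assumptions chosen for the example, not PATROL defaults.

```python
from statistics import mean, stdev


def baseline(samples):
    """Return (mean, sample standard deviation) of the history values."""
    return mean(samples), stdev(samples)


def deviates(value, samples, threshold=3.0):
    """True if value is more than `threshold` standard deviations from the mean."""
    average, spread = baseline(samples)
    if spread == 0:
        return value != average
    return abs(value - average) / spread > threshold


if __name__ == "__main__":
    # Hypothetical CPUprcrProcessorTimePercent history values.
    history = [20, 22, 21, 19, 23]
    print(baseline(history))
    print(deviates(40, history))
```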

Recovery actions do not execute because the PATROL Agent default account lacks the rights to execute them

The built-in recovery actions are enabled but they do not execute. A message indicating that access is denied might be displayed in the PATROL console system output window. Assign local administrator rights to the PATROL Agent default account on the host where you want to execute the recovery action. For more information about the account rights required, see Accounts.

PATROL prompts before running recovery action even though the Do not ask me again option is selected

This is a known issue. It occurs when the process runs with a different process identification (PID) number and therefore appears to PATROL as a different process.

Workaround - You can configure the recovery action to run automatically instead of with operator confirmation. For more information about configuring recovery actions, see Configuring recovery actions.

Cannot add performance monitor counters with alarm ranges less than 1

The PATROL Wizard for Performance Monitor and WMI does not accept decimal alarm ranges that are less than 1, because PATROL alarm ranges must be integer values. As a result, you cannot create useful alarm ranges for a Microsoft Performance Monitor counter whose values are normally less than 1. To resolve this problem, multiply the reported value by a specified amount so that integer alarm ranges become meaningful. You can use the same approach if the value reported by the counter is too large. In that case, multiply the reported value by a number less than 1.

  1. Use the PATROL Wizard for Performance Monitor and WMI to create parameters for a Performance Monitor counter, as described in Creating the performance monitor parameters.
  2. Using PATROL Configuration Manager or the pconfig utility, display the following configuration variable:
    /Perfmon/NT_PERFMON_WIZARD/object/Counters
    where object is the Microsoft Performance Monitor object.
  3. Edit the configuration variable value by adding *multiplier after the counter name, where multiplier is the numerical value by which you want to multiply the reported value.
    For example, to multiply the reported value of the counter Active Threads by 100, add *100 to the variable, as shown: Active Threads*100.

    If you are monitoring multiple counters for the object, you can also multiply the other counters by a multiplier. For example:
    counter1*100, counter2, counter3*0.1

    Warning

    When entering a multiplier that is less than 1, you must include a leading zero. For example, you must enter 0.1, and not .1.

  4. Apply the configuration change to the agent.
    The value reported by PATROL for the selected counter is adjusted by the multiplier that you entered.
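
The multiplier syntax in the steps above can be generated with a small helper, which also guarantees the leading zero for multipliers below 1. This Python sketch assumes that counters in the variable are separated by commas, as the example suggests; the function name and counter names are placeholders.

```python
from decimal import Decimal


def format_counters(counters):
    """Build the variable value from (counter name, multiplier or None) pairs."""
    parts = []
    for name, multiplier in counters:
        if multiplier is None:
            parts.append(name)
            continue
        # Fixed-point text keeps the leading zero (0.1, never .1) and avoids
        # exponent notation for very small multipliers.
        text = format(Decimal(str(multiplier)).normalize(), "f")
        parts.append(f"{name}*{text}")
    return ", ".join(parts)


if __name__ == "__main__":
    print(format_counters([("Active Threads", 100), ("counter2", None), ("counter3", 0.1)]))
    # A counter that normally reports 0.004 becomes 4 when multiplied by 1000,
    # so integer alarm ranges such as 3 to 5 become usable.
    print(round(0.004 * 1000, 6))
```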

The AdPerfCollector parameter displays error message

When a Windows Server 2003 or Windows 2000 Server computer is promoted to a domain controller (DC), the annotated data point for the AdPerfCollector parameter might display the following error message:

ERROR- Error: WBEM_E_INVALID_CLASS

This error is shown when the required Microsoft Performance Counters are not available in WMI. Follow the instructions in Microsoft Knowledge Base Article 266416 to dredge the performance counters from the registry and make them available in WMI.

Collect DHCP scope logs to troubleshoot issues on Windows Server 2012 and later

  1. From the command prompt, change to the %PATROL_HOME%\bin directory.
  2. Run the debug command that matches the failure and collect the %PATROL_HOME%\log\PtDHCPCollector.log file.

Debug commands:

  • PtDHCPCollector -scopeList true - if scope discovery fails
  • PtDHCPCollector -scopeStats true - if scope collection fails
  • PtDHCPCollector -serverStats true - if collection fails for IPv6 scope statistics of a DHCP server
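
If the log collection is scripted, the choice of debug command can be expressed as a lookup. In the following Python sketch the options come from the list above, while the dictionary keys, the function name, and the sample PATROL_HOME value are hypothetical.

```python
from pathlib import PureWindowsPath

# Options taken from the debug commands above; the keys are illustrative labels.
DEBUG_ARGS = {
    "scope_discovery": ["-scopeList", "true"],
    "scope_collection": ["-scopeStats", "true"],
    "ipv6_server_stats": ["-serverStats", "true"],
}


def dhcp_debug_command(failure, patrol_home):
    """Return the command line (as a list) for the given failure type."""
    exe = PureWindowsPath(patrol_home) / "bin" / "PtDHCPCollector"
    return [str(exe), *DEBUG_ARGS[failure]]


if __name__ == "__main__":
    print(dhcp_debug_command("scope_discovery", r"C:\Patrol3"))
```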

Unable to see monitoring results

You have configured an infrastructure policy for a monitor type. However, you do not see the monitored device or configuration.

In this case, make sure that the monitor type that you have configured is active:

  1. On the computer where PATROL Agent is installed, go to %PATROL_HOME%\patrol3\lib\knowledge\.
  2. Open the KM file for the monitor type that you have configured.
    For example, if you have configured the Share monitor type, open the NT_Shares.km file.
  3. In the KM file, search for Applications and make sure ACTIVE is set to True.
    For example:
    APPLICATIONS = {
    { NAME = "NT_SHARES", ACTIVE = True,
    SECURITY = False, PROPAGATE_STATE = True,
    CREATE_ICON = True, SUSPEND_GLOBAL_PARAMS = False, DISCOVERY_TIME = 300
    ...}
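
The ACTIVE check can also be done with a short script instead of a manual search. This Python sketch reads the text of a KM file and reports the ACTIVE setting for each application entry. It assumes that ACTIVE directly follows NAME, as in the example above; KM files with a different layout would need a different pattern.

```python
import re

# Assumes the layout shown in the example: NAME = "...", ACTIVE = True|False
APPLICATION_ENTRY = re.compile(
    r'NAME\s*=\s*"(?P<name>[^"]+)"\s*,\s*ACTIVE\s*=\s*(?P<active>\w+)'
)


def application_states(km_text):
    """Map each application name to True if ACTIVE is set to True."""
    return {
        match["name"]: match["active"].lower() == "true"
        for match in APPLICATION_ENTRY.finditer(km_text)
    }


if __name__ == "__main__":
    sample = (
        'APPLICATIONS = {\n'
        '{ NAME = "NT_SHARES", ACTIVE = True,\n'
        'SECURITY = False, PROPAGATE_STATE = True,\n'
    )
    for name, active in application_states(sample).items():
        print(name, "active" if active else "NOT active")
```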

 
