Troubleshooting


The following topics describe basic features and information that you can use when troubleshooting your system. Use this information to identify and resolve problems that might occur during installation and configuration, or during the collection, transfer, analysis, and population of data.

Before you begin

Before you begin troubleshooting, confirm the following basic information:

  1. Java Home - The Java version used by the KM must be 1.7.00 or later.
  2. The PATROL user must have permissions to run the Java declared in the Java Home field. Add admin OS credentials to run the declared Java version.

    The PATROL user credentials are used if the local username and password are not provided. The credentials used to run the Java collector must have permissions to run the Java declared in the Java Home field.

  3. You must be able to reach the following machines by using the ping command or an HTTP URL connection, for example http://<hostname>:50070. The Hadoop KM connects to the URL of each machine to get its metrics.
    1. Hadoop remote machine
    2. Hadoop components like ResourceManager, DataNode, NodeManager, JobTracker, TaskTracker, SecondaryNameNode, NameNode and JobHistory
    3. Any other machine that you expect to see in the PATROL tree
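
The prerequisites above can be checked with a short script before you open a support case. The following is a minimal sketch, not part of the KM: the function names are hypothetical, the version parsing assumes the usual `java -version` output format, and the reachability check only confirms that a TCP connection to the port can be opened.

```python
import re
import socket


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in the output")
    first, second = int(match.group(1)), match.group(2)
    # Legacy scheme: "1.7.0_80" is Java 7. Newer scheme: "11.0.2" is Java 11.
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_minimum(version_output: str, minimum: int = 7) -> bool:
    """Return True when the reported Java version is 1.7 (Java 7) or later."""
    return java_major_version(version_output) >= minimum


def component_url(host: str, port: int = 50070) -> str:
    """Build the HTTP URL used in the example above (port 50070 is taken from that example)."""
    return f"http://{host}:{port}"


def is_reachable(host: str, port: int = 50070, timeout: float = 3.0) -> bool:
    """Return True when a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

To use the sketch, pass the text printed by the Java declared in the Java Home field to `meets_minimum`, and call `is_reachable` for each Hadoop machine that you expect to see in the PATROL tree.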

Enabling debug

Follow these steps to enable PSL and Java logging:

In BPPM,

On the Add Monitor Types dialog, with the Monitoring Profile set to Hadoop, and the Monitor Type set to Hadoop, in the KM Administration group, select the Logging option.

logging.png

In PATROL Console,

  1. Right-click the Hadoop Environment node and choose KM Commands > KM Administration > Logging.
  2. Click Enable to start debugging and Disable to stop debugging.

logging_patrol.png

Use of LogCollector tool

The LogCollector tool automatically collects all the KM logs and diag reports required to identify the issue. Browse to <<Patrol 3 home>>/Hadoop to locate the LogCollector tool.

Follow these steps to use the LogCollector tool:

  1. Enable logging as described in Enabling debug
  2. Reproduce the issue
  3. Set the PATROL_HOME variable
  4. Run the LogCollector tool

LogCollector collects the following:

  • Diag logs (if any)
  • DiscoveryXML (if any)
  • All JAVA logs
  • Hadoop PSL logs and PATROL Agent logs from <<Patrol 3 home>>/Hadoop/log

After LogCollector runs successfully, it creates a zip file named hadoopLogsFile_<date and time>.zip that contains all the data that BMC Support requires to troubleshoot the issue.
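
Before sending the zip file to BMC Support, you might want to confirm that it contains the expected logs. The following is a minimal sketch that finds the most recent hadoopLogsFile_*.zip in a directory and lists its entries. The function names are hypothetical, and the directory where LogCollector writes the file is not stated here, so you must supply it.

```python
import zipfile
from pathlib import Path
from typing import List, Optional


def latest_log_archive(search_dir: Path) -> Optional[Path]:
    """Return the most recently modified hadoopLogsFile_*.zip in search_dir, or None."""
    candidates = sorted(
        search_dir.glob("hadoopLogsFile_*.zip"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None


def archive_contents(archive: Path) -> List[str]:
    """Return the sorted entry names stored in the zip file."""
    with zipfile.ZipFile(archive) as zf:
        return sorted(zf.namelist())
```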

Frequently asked questions

Why am I unable to see data for some parameters? Can I request a new set of parameters?

You can generate a diagnostics report that identifies the missing collection data.

Right-click the Hadoop Environment node and select Knowledge Modules (KM) Commands > KM Administration > HADOOP Diagnostics Report. This KM command generates an XML file that contains the JMX result for each Hadoop component, such as NameNode, ResourceManager, DataNode, or any other node that exists in the PATROL environment. The XML file is located at <<Patrol 3 home>>/Hadoop/report/diag and is named <<environment name>>_DiagnosticReport.xml.

This process might take some time.
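
After the report is generated, a quick way to see what it contains is to count its XML elements. The following is a minimal sketch with hypothetical function names. It builds the report path from the location described above and makes no assumption about the report schema, which is not documented here.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def diagnostic_report_path(patrol_home: str, environment: str) -> Path:
    """Build the report path from the Patrol 3 home directory and the environment name."""
    return (
        Path(patrol_home) / "Hadoop" / "report" / "diag"
        / f"{environment}_DiagnosticReport.xml"
    )


def element_counts(report: Path) -> Counter:
    """Count how often each element tag occurs in the report."""
    return Counter(element.tag for element in ET.parse(report).iter())
```

A tag that you expect for a component but that has a count of zero can point to the collection data that is missing.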

How can I stop monitoring of a configured Hadoop environment without deleting it from the pconfig or without using Blackout functionality?

  1. Set the /BTQ/HADOOP/<<environment name>>/disableInstance pconfig environment variable to 1 or true to stop the monitoring.
  2. Set the /BTQ/HADOOP/<<environment name>>/disableInstance pconfig environment variable to 0 or false to restart the monitoring.
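
If you manage several environments, a small helper can keep the variable name and value consistent. The following is a minimal sketch with hypothetical function names. It only builds the variable path and value described above and does not write anything to the pconfig. Treating the value as case-insensitive is an assumption.

```python
from typing import Tuple


def disable_instance_setting(environment: str, disabled: bool) -> Tuple[str, str]:
    """Return the pconfig variable path and the value that stops or restarts monitoring."""
    variable = f"/BTQ/HADOOP/{environment}/disableInstance"
    return variable, "1" if disabled else "0"


def is_monitoring_disabled(value: str) -> bool:
    """Interpret a disableInstance value: 1 or true stops monitoring, 0 or false restarts it."""
    normalized = value.strip().lower()  # assumption: case and surrounding spaces are ignored
    if normalized in ("1", "true"):
        return True
    if normalized in ("0", "false"):
        return False
    raise ValueError(f"unexpected disableInstance value: {value!r}")
```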

scenario 3.png

How do I change the Java collection interval?

By default, Hadoop Availability data is collected every 1 minute and Hadoop Data is collected every 5 minutes. To change the default collection time, follow these steps:

From CMA:

Update the Availability collection time (min) and the Data collection time (min) fields to set a new collection time.

scenario 4.png

From PATROL:

In the Hadoop environment variable,

  • For Availability, update the <cycle in sec> value in the "/BTQ/HADOOP/<<environment name>>/collectors/availabilityCollector/Cycle" = {REPLACE = "<cycle in sec>"} pconfig variable
  • For Data, update the <cycle in sec> value in the "/BTQ/HADOOP/<<environment name>>/collectors/dataCollector/Cycle" = {REPLACE = "<cycle in sec>"} pconfig variable
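
The CMA fields take minutes, while the pconfig variables take seconds, so the value must be converted when you switch between the two. The following is a minimal sketch with a hypothetical function name that converts minutes to seconds and renders the rule text in the form shown above. It does not apply the rule to the PATROL Agent.

```python
COLLECTORS = ("availabilityCollector", "dataCollector")


def cycle_rule(environment: str, collector: str, minutes: float) -> str:
    """Render the pconfig rule text for a collector cycle given in minutes."""
    if collector not in COLLECTORS:
        raise ValueError(f"unknown collector: {collector!r}")
    seconds = int(round(minutes * 60))  # the Cycle variable is expressed in seconds
    if seconds <= 0:
        raise ValueError("the cycle must be longer than zero seconds")
    variable = f"/BTQ/HADOOP/{environment}/collectors/{collector}/Cycle"
    return f'"{variable}" = {{REPLACE = "{seconds}"}}'
```

For example, `cycle_rule("prod", "dataCollector", 5)` renders the default 5-minute Data cycle as 300 seconds for a hypothetical environment named prod.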

What should I do if the Java collector process consumes high memory?

When monitoring a large-scale Hadoop server environment, the Java collector process might consume high memory. To set the maximum heap size of the Java collector, follow these steps:

  1. Right-click the Hadoop Environment node
  2. Select Knowledge Modules (KM) Commands > KM Administration > Define JVM Arguments
  3. Add the argument -Xmx512m
  4. Click Accept
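
The -Xmx argument sets the maximum heap size of the JVM, so -Xmx512m allows the Java collector a heap of up to 512 MB. If you choose a different value, a typo in the argument can prevent the JVM from starting. The following is a minimal sketch with a hypothetical function name that checks the form of the argument and converts it to bytes. It is not part of the KM.

```python
import re

# Size suffixes accepted by -Xmx; no suffix means bytes.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_xmx(argument: str) -> int:
    """Return the maximum heap size in bytes for an argument such as -Xmx512m."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", argument.strip())
    if match is None:
        raise ValueError(f"not a valid -Xmx argument: {argument!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]
```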

 
