Spark page in the Hadoop view

The Spark page in the Hadoop view provides insight into application execution and the corresponding resource utilization for the Hadoop Spark service.

To access the page, in the TrueSight console navigation pane, click Capacity Views > Hadoop > Hadoop View, and in the Hadoop View page, click the Spark tab.


The data on the page is categorized based on different subsystems and grouped under separate tabs. Click a tab to view the metrics for that subsystem:

| Subsystem | Description |
| --- | --- |
| Summary | Displays a table with key performance indicators for Spark |
| Details | Displays Spark charts over time |
| Applications | Displays a table with application-level data about Spark |

Metrics per cluster are displayed in a tabular format.

By default, some columns are hidden. You can show or hide columns using the action menu that is located next to the table title.

This topic contains the following sections: Summary, Details, and Applications.

Summary

Resource utilization and Spark performance indicators are displayed for each cluster.

Each row in the table represents a Hadoop cluster and displays the following metrics:


| Column | Description | Statistical Type | Metric |
| --- | --- | --- | --- |
| Cluster | Name of the Hadoop cluster | Last Value | NAME |
| Distribution | Hadoop distribution | Last Value | HADOOP_DISTRIBUTION |
| Tasks Completed | Number of completed tasks | Average | APP_SPARK_TASKS_COMPLETED |
| Tasks Failed | Number of failed tasks | Average | APP_SPARK_TASKS_FAILED |
| Tasks Total | Total number of executed tasks | Average | APP_SPARK_TASKS_TOTAL |
| RDD Blocks | Number of Resilient Distributed Dataset (RDD) blocks processed | Average | APP_SPARK_RDD_BLOCKS |
| CPU cores used | CPU cores used | Average | APP_SPARK_CORES |
| Memory used | Memory used | Average | APP_SPARK_MEM_USED_BYTES |
| Input data size | Data input size | Average | APP_SPARK_INPUT_BYTES |
| Shuffle reads | Spark shuffle read bytes | Average | APP_SPARK_SHUFFLE_READ_BYTES |
| Shuffle writes | Spark shuffle write bytes | Average | APP_SPARK_SHUFFLE_WRITE_BYTES |
| GC time | Time spent in Java garbage collection | Average | APP_SPARK_GC_TIME |
| Task time | Total task duration | Sum | APP_SPARK_TASK_TIME |
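For reference, most of these per-cluster counters correspond to executor-level fields that Spark itself exposes through its monitoring REST API (`GET /api/v1/applications/<app-id>/executors`). The sketch below aggregates a sample of that JSON into the indicators above. The JSON field names are standard Spark; the mapping to the TrueSight metric names is an illustrative assumption, not a statement of how the product's collector actually works.

```python
# Sketch: summing executor-level counters from Spark's monitoring REST API
# into cluster-level indicators like the Summary tab's columns.
# The mapping of metric name -> REST field is an assumption for illustration.

MB, GB = 1024**2, 1024**3

# Trimmed sample of what the executors endpoint returns (two executors).
executors = [
    {"id": "1", "rddBlocks": 12, "memoryUsed": 512 * MB, "totalCores": 4,
     "completedTasks": 180, "failedTasks": 2, "totalTasks": 182,
     "totalGCTime": 4200, "totalDuration": 96000, "totalInputBytes": 3 * GB,
     "totalShuffleRead": 700 * MB, "totalShuffleWrite": 650 * MB},
    {"id": "2", "rddBlocks": 9, "memoryUsed": 480 * MB, "totalCores": 4,
     "completedTasks": 150, "failedTasks": 0, "totalTasks": 150,
     "totalGCTime": 3100, "totalDuration": 81000, "totalInputBytes": 2 * GB,
     "totalShuffleRead": 500 * MB, "totalShuffleWrite": 430 * MB},
]

# Assumed mapping: TrueSight metric name -> Spark REST field.
FIELD_MAP = {
    "APP_SPARK_TASKS_COMPLETED": "completedTasks",
    "APP_SPARK_TASKS_FAILED": "failedTasks",
    "APP_SPARK_TASKS_TOTAL": "totalTasks",
    "APP_SPARK_RDD_BLOCKS": "rddBlocks",
    "APP_SPARK_CORES": "totalCores",
    "APP_SPARK_MEM_USED_BYTES": "memoryUsed",
    "APP_SPARK_INPUT_BYTES": "totalInputBytes",
    "APP_SPARK_SHUFFLE_READ_BYTES": "totalShuffleRead",
    "APP_SPARK_SHUFFLE_WRITE_BYTES": "totalShuffleWrite",
    "APP_SPARK_GC_TIME": "totalGCTime",
    "APP_SPARK_TASK_TIME": "totalDuration",
}

def summarize(execs):
    """Sum each mapped counter across all executors of a cluster."""
    return {metric: sum(e[field] for e in execs)
            for metric, field in FIELD_MAP.items()}

summary = summarize(executors)
```

Note that the view reports most of these values as an Average over the observation period, while the REST counters are cumulative; a real collector would sample them periodically and derive the statistic from the samples.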

Details

This page displays Spark information for the selected cluster under the following sections or panels:

| Section or Panel | Description |
| --- | --- |
| Main page | A graphical analysis of task execution and corresponding resource utilization for the Spark service of the selected cluster |
| Configuration Details page | Configuration information about the cluster |

Applications

Application-level analysis is presented, with insight into both Spark performance and resource utilization.

Each row in the table represents a Spark application and displays the following metrics:

 


| Column | Description | Statistical Type | Metric |
| --- | --- | --- | --- |
| Cluster | Name of the Hadoop cluster | Last Value | NAME |
| Application | Name of the Spark application | Last Value | NAME |
| Distribution | Hadoop distribution | Last Value | HADOOP_DISTRIBUTION |
| Tasks Completed | Number of completed tasks | Average | BYAPP_SPARK_TASKS_COMPLETED |
| Tasks Failed | Number of failed tasks | Average | BYAPP_SPARK_TASKS_FAILED |
| Tasks Total | Total number of executed tasks | Average | BYAPP_SPARK_TASKS_TOTAL |
| RDD Blocks | Number of Resilient Distributed Dataset (RDD) blocks processed | Average | BYAPP_SPARK_RDD_BLOCKS |
| CPU cores used | CPU cores used | Average | BYAPP_SPARK_CORES |
| Memory used | Memory used | Average | BYAPP_SPARK_MEM_USED_BYTES |
| Input data size | Data input size | Average | APP_SPARK_INPUT_BYTES |
| Shuffle reads | Spark shuffle read bytes | Average | APP_SPARK_SHUFFLE_READ_BYTES |
| Shuffle writes | Spark shuffle write bytes | Average | APP_SPARK_SHUFFLE_WRITE_BYTES |
| GC time | Time spent in Java garbage collection | Average | APP_SPARK_GC_TIME |
| Task time | Total task duration | Sum | APP_SPARK_TASK_TIME |
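The per-application rows can likewise be traced back to Spark's monitoring REST API: a collector would first list applications (`GET /api/v1/applications`) and then aggregate each application's executor counters. A minimal sketch follows; `fetch_json` is a hypothetical stub standing in for a real HTTP GET, and only a few of the columns are shown, but the REST paths are standard Spark monitoring endpoints.

```python
# Sketch of building Applications-tab style rows from Spark's REST API:
# list applications, then sum each one's executor counters.
# `fetch_json` is a hypothetical stub replacing an HTTP GET against the
# Spark UI or history server (e.g. http://<driver>:4040 + path).

SAMPLE = {
    "/api/v1/applications": [
        {"id": "app-001", "name": "etl-nightly"},
        {"id": "app-002", "name": "ad-hoc-query"},
    ],
    "/api/v1/applications/app-001/executors": [
        {"completedTasks": 300, "failedTasks": 5, "totalDuration": 120000},
        {"completedTasks": 280, "failedTasks": 1, "totalDuration": 110000},
    ],
    "/api/v1/applications/app-002/executors": [
        {"completedTasks": 40, "failedTasks": 0, "totalDuration": 9000},
    ],
}

def fetch_json(path):
    # Hypothetical stub: a real collector would issue an HTTP request here.
    return SAMPLE[path]

def application_rows():
    """One row per Spark application, mirroring a subset of the table above."""
    rows = []
    for app in fetch_json("/api/v1/applications"):
        execs = fetch_json(f"/api/v1/applications/{app['id']}/executors")
        rows.append({
            "Application": app["name"],
            "BYAPP_SPARK_TASKS_COMPLETED": sum(e["completedTasks"] for e in execs),
            "BYAPP_SPARK_TASKS_FAILED": sum(e["failedTasks"] for e in execs),
            "BYAPP_SPARK_TASK_TIME": sum(e["totalDuration"] for e in execs),
        })
    return rows

rows = application_rows()
```

The `BYAPP_` prefix distinguishes these per-application metrics from the cluster-level `APP_SPARK_` metrics of the Summary tab.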

For more information, see Sorting tables in views in the TrueSight console and Using filtering options in views in the TrueSight console.
