
Figure 13. Grafana Spark Metrics Dashboard
Graphs displayed on this dashboard are grouped into the following sections:
○
READ/WRITE: Displays file system statistics for a Spark executor. Results in the
graphs of this section are displayed per node for a particular Spark Job. The Y-axis displays the number
of bytes, whereas the X-axis displays the start/stop time of the task for a particular Spark Job.
This section contains the following graphs:
▪
Executor HDFS Read/Write Per Job (in bytes): Bytes read from and written to HDFS.
▪
Executor File System Read/Write Per Job (in bytes): Bytes read from and written to a file system.
○
SPARK JOBS: Displays statistics related to the list of executors per node for a particular Spark Job. The
Y-axis displays the number of tasks and the X-axis displays the start/stop time of the task for a particular
Spark Job.
This section contains the following graphs:
▪
Completed Tasks Per Job: The approximate total number of tasks that have completed execution.
▪
Active Tasks Per Job: The approximate number of threads that are actively executing tasks.
▪
Current Pool Size Per Job: The current number of threads in the pool.
▪
Max Pool Size Per Job: The maximum number of threads allowed in the pool.
○
DAG Scheduler: Displays statistics related to Spark's Directed Acyclic Graphs.
This section contains the following graphs:
▪
DAG Schedule Stages: Displays the following types of DAG stages:
▪
Waiting Stages: Stages with parents to be computed.
▪
Running Stages: Stages currently being run.
▪
Failed Stages: Stages that failed due to fetch failures (as reported by CompletionEvents for FetchFailed
end reasons) and are going to be resubmitted.
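The descriptions of the four SPARK JOBS graphs closely follow the wording of the getters on java.util.concurrent.ThreadPoolExecutor, so a small sketch may help show how the values relate to each other. This is an illustration only: it assumes the graphs plot gauges read from the executor's task thread pool, and the class name PoolGauges and the helper snapshot are hypothetical names that are not part of Spark or of the dashboard. Which getter backs the Max Pool Size graph is also an assumption; getMaximumPoolSize() is the configured upper limit, whereas getLargestPoolSize() is the largest number of threads that have ever been in the pool at the same time.

```java
import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGauges {
    // Hypothetical helper: reads the four pool values in the order the graphs are listed.
    static long[] snapshot(ThreadPoolExecutor pool) {
        return new long[] {
            pool.getCompletedTaskCount(), // approximate total number of completed tasks
            pool.getActiveCount(),        // approximate number of threads executing tasks
            pool.getPoolSize(),           // current number of threads in the pool
            pool.getMaximumPoolSize()     // maximum number of threads allowed (assumed mapping)
        };
    }

    public static void main(String[] args) throws InterruptedException {
        // A pool of up to 4 threads standing in for an executor's task pool.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);

        // Submit two tasks that block until released, so they count as active.
        for (int i = 0; i < 2; i++) {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        started.await();
        System.out.println("while running:  " + Arrays.toString(snapshot(pool)));

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("after shutdown: " + Arrays.toString(snapshot(pool)));
    }
}
```

The completed and active counts are documented by the JDK as approximations because tasks and threads can change state while the values are being read, which is why the graph descriptions above use the word "approximate".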