
Application/Script          Log File Location
● urika-yam-status
● urika-yam-flexdown
● urika-yam-flexdown-all
● urika-yam-flexup
ZooKeeper                   /var/log/zookeeper
Hive Metastore              /var/log/hive
HiveServer2                 /var/log/hive
HUE                         /var/log/hue
Spark Thrift Server         /var/log/spark
Spark Audit Logs
A per-user Spark audit log that details the start and stop of applications is located at /var/log/spark/k8s/username.log, with entries of the following form:
Tue Apr 03 07:54:05 CDT 2018 username spark-test-1522760043061-driver START \
Application Started with 1 driver plus 5.0 executors using 6.0 cores and 496.0GB memory
Tue Apr 03 07:54:38 CDT 2018 username spark-test-1522760043061-driver STOP \
Application Stopped
Tue Apr 3 08:15:51 CDT 2018 username username-shell-159738-5d5b87c8b-82td6 \
START spark-shell Shell Started with 1 driver using 16.0 cores and 60GB memory
Tue Apr 3 08:16:36 CDT 2018 username username-shell-159738-5d5b87c8b-82td6 \
STOP spark-shell Shell Stopped
Wed Apr 04 04:28:45 CDT 2018 username spark-test-1522834122688-driver START \
Application Started with 1 driver plus 7.0 executors using 8.0 cores and 688.0GB memory
Wed Apr 04 04:29:38 CDT 2018 username spark-test-1522834122688-driver STOP \
Application Stopped
Wed Apr 04 04:30:09 CDT 2018 username spark-test-1522834207396-driver START \
Application Started with 1 driver plus 2.0 executors using 3.0 cores and 208.0GB memory
Wed Apr 04 04:30:42 CDT 2018 username spark-test-1522834207396-driver STOP \
Application Stopped
Wed Apr 04 04:32:05 CDT 2018 username spark-test-1522834323513-driver START \
Application Started with 1 driver plus 2.0 executors using 17.0 cores and 208.0GB memory
Wed Apr 04 04:32:28 CDT 2018 username spark-test-1522834323513-driver STOP \
Application Stopped
These log files are located on whichever node an application is submitted from. This is typically login1, but may be another node depending on how the system is configured for user access.
This log has the general format date username driver-pod-name action message, where:
● driver-pod-name is the name of the Kubernetes pod that acts as the driver for the application. It can be used to link this information to more detailed information from Kubernetes.
● action is either START or STOP.
● message contains informational content, such as the resources requested by the user's application.
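The field layout described above can be sketched as a small Python parser. This is a minimal illustration, not part of the product: the names AuditEntry and parse_audit_line are hypothetical, and the sketch assumes that each entry occupies a single line in the log file (the trailing backslashes in the sample are treated as line wrapping in this guide) and that the date always consists of six whitespace-separated tokens, as in the sample entries. The date is kept as a string because time zone abbreviations such as CDT are not parsed reliably by standard library routines.

```python
from typing import NamedTuple, Optional


class AuditEntry(NamedTuple):
    """One audit log entry (hypothetical structure for illustration)."""
    date: str
    username: str
    driver_pod_name: str
    action: str
    message: str


def parse_audit_line(line: str) -> Optional[AuditEntry]:
    """Split one audit log line into its five fields.

    Assumes the date is six whitespace-separated tokens, for example
    "Tue Apr 03 07:54:05 CDT 2018". Returns None for lines that do
    not match the expected layout.
    """
    # Up to 10 parts: 6 date tokens, username, pod name, action, message.
    parts = line.strip().split(None, 9)
    if len(parts) < 9:
        return None
    action = parts[8]
    if action not in ("START", "STOP"):
        return None
    message = parts[9] if len(parts) == 10 else ""
    return AuditEntry(
        date=" ".join(parts[:6]),
        username=parts[6],
        driver_pod_name=parts[7],
        action=action,
        message=message,
    )
```

Calling parse_audit_line on each line of /var/log/spark/k8s/username.log would yield one AuditEntry per well-formed entry. The driver_pod_name field is the value to use when looking up the same application in Kubernetes.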
CAUTION: If a job fails or is cancelled (for example, the user presses Ctrl+C or kills the job), there may be no corresponding STOP event registered for a job reported as started.
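Because a STOP event may be missing, it can be useful to list the START entries that have no later STOP for the same driver pod. The following Python sketch shows one way to do this. The function name find_unmatched_starts is hypothetical, and the sketch assumes one entry per line with a six-token date followed by the username, driver pod name, and action. An application that is still running also appears in the result, so an unmatched START does not by itself indicate a failure.

```python
from typing import Dict, Iterable


def find_unmatched_starts(lines: Iterable[str]) -> Dict[str, str]:
    """Return {driver_pod_name: start_date} for STARTs without a STOP.

    Assumes each entry is a single line whose first six tokens are the
    date, followed by username, driver pod name, and action.
    """
    open_starts: Dict[str, str] = {}
    for line in lines:
        parts = line.strip().split(None, 9)
        if len(parts) < 9:
            continue  # skip blank or malformed lines
        pod_name, action = parts[7], parts[8]
        if action == "START":
            open_starts[pod_name] = " ".join(parts[:6])
        elif action == "STOP":
            # Discard the matching START, if one was seen.
            open_starts.pop(pod_name, None)
    return open_starts


sample = [
    "Wed Apr 04 04:28:45 CDT 2018 username spark-test-1522834122688-driver START Application Started",
    "Wed Apr 04 04:29:38 CDT 2018 username spark-test-1522834122688-driver STOP Application Stopped",
    "Wed Apr 04 04:30:09 CDT 2018 username spark-test-1522834207396-driver START Application Started",
]
unmatched = find_unmatched_starts(sample)
```

In this sample, only spark-test-1522834207396-driver remains in unmatched, since its START has no corresponding STOP. The pod names returned can then be checked against Kubernetes to determine whether the application is still running or ended abnormally.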