
adding or removing containers when demand changes, keeping storage consistent across multiple instances of an
application, distributing load between containers, and launching new containers on different machines if
something fails.
On Urika-GX, Kubernetes is used to manage containers in the secure service mode.
NOTE: Currently, Kubernetes supports only Spark images on Urika-GX.
For more information, see the Kubernetes documentation.
About the Cray Spark Image
To run Spark on Kubernetes, Urika-GX ships with customized Spark images, which are based on the
Spark version used on the system.
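These images are selected through Spark configuration properties. On the Spark 2.2 Kubernetes fork that the example JAR shown later in this section comes from, the image-selection properties are spark.kubernetes.driver.docker.image and spark.kubernetes.executor.docker.image; the following is a sketch with hypothetical image names, and on Urika-GX these values are presumably preset to the shipped Cray images:
$ spark-submit --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.driver.docker.image=registry.local/spark-driver:2.2.0 \
  --conf spark.kubernetes.executor.docker.image=registry.local/spark-executor:2.2.0 \
  local:///opt/spark/examples/target/scala-2.11/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar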
3.15.1 Execute Spark Jobs on Kubernetes
Spark jobs run inside containers, which are managed via Kubernetes on the Urika-GX system. This section
provides examples of executing Spark jobs, retrieving output, and viewing logs.
The system must be running in secure mode, and the user must be logged on to a login node to run the
examples shown in this section.
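Before submitting a job, it can be useful to confirm that Kubernetes is reachable from the login node. A minimal check, assuming kubectl is installed on the login node and the user's namespace has already been set up, is to list pods; with no jobs running, kubectl reports that no resources were found:
$ kubectl get pods
No resources found.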
Running a Spark Pi Example Job
The following example shows how to run a simple Spark Pi job inside a container. It uses
spark.app.name to set the Spark job's name.
$ spark-submit --class org.apache.spark.examples.SparkPi \
  --conf spark.app.name=spark-pi \
  local:///opt/spark/examples/target/scala-2.11/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar
The path to the JAR file must be the path inside the container, not the path that exists
on the host system. Inside the container, the Spark home directory is /opt/spark instead
of /opt/cray/spark2/default.
The preceding command produces output similar to the following (only a portion of the output is
shown below for brevity):
2018-02-26 16:16:47 INFO HadoopStepsOrchestrator:54 - Hadoop Conf directory: /etc/hadoop/conf
2018-02-26 16:16:47 INFO HadoopConfBootstrapImpl:54 - HADOOP_CONF_DIR defined. Mounting Hadoop specific files
2018-02-26 16:16:48 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-02-26 16:16:48 INFO LoggingPodStatusWatcherImpl:54 - State changed, new state:
    pod name: spark-pi-1519683406605-driver
    namespace: username
    labels: spark-app-selector -> spark-027d506894bd4b2ca86692f03f9fab5a, spark-role -> driver
    pod uid: bceaf8b2-1b42-11e8-8b39-001e67d33475
    creation time: 2018-02-26T22:16:48Z
    service account name: spark
    volumes: spark-local-dir-0-tmp, hadoop-properties, spark-token-
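The driver pod named in the preceding output can be monitored from another terminal with standard kubectl commands. The following sketch reuses the pod name and namespace from the sample output above; substitute the values reported for the actual job:
$ kubectl get pods -n username
$ kubectl logs -n username spark-pi-1519683406605-driver
The driver log is where the job's standard output appears, including the SparkPi result (a line similar to "Pi is roughly 3.14...").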