
Mesos tracks the frameworks that have registered with it. If jobs are submitted and there are no resources
available, frameworks will not be dropped. Instead, frameworks remain registered (unless manually killed by
the user) and continue to receive updated resource offers from Mesos at regular intervals. As an example,
consider a scenario in which four Spark jobs are running and using all of the resources on the cluster. A user
submits a job with framework Y, and framework Y waits for resources. As each Spark job completes
its execution, it releases its resources and Mesos updates its resource availability. Mesos will continue to make
resource offers to framework Y, which can choose either to accept or reject them. If Y accepts the
resources, it schedules its tasks. If Y rejects the resources, it remains registered with Mesos and
continues to receive resource offers from Mesos.
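The offer cycle described above can be sketched as a small simulation. All names here are hypothetical; this is an illustration of the accept/reject protocol, not the Mesos API.

```python
# Illustrative simulation of the Mesos offer cycle: a framework that
# rejects an offer stays registered and is offered resources again
# on the next cycle. Hypothetical names; not the real Mesos API.

def run_offer_cycle(free_cores, frameworks):
    """Repeatedly offer freed resources to waiting frameworks.

    'frameworks' maps a framework name to the cores it needs.
    Returns the frameworks that scheduled tasks and the cores left over.
    """
    scheduled = []
    waiting = dict(frameworks)
    while waiting:
        progress = False
        for name, needed in list(waiting.items()):
            # Mesos offers the currently free resources to the framework.
            if free_cores >= needed:
                # Framework accepts the offer and schedules its tasks.
                free_cores -= needed
                scheduled.append(name)
                del waiting[name]
                progress = True
            # else: framework rejects the offer but remains registered.
        if not progress:
            break  # no waiting framework can be satisfied yet
    return scheduled, free_cores

# Framework Y's demand fits in the freed cores; Z keeps waiting.
done, left = run_offer_cycle(free_cores=64, frameworks={"Y": 40, "Z": 30})
```

In this run, Y accepts an offer (40 of 64 free cores) while Z rejects the remaining 24 and stays registered, mirroring the scenario in the text.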
Allocation of resources to Spark by Mesos
Let us say that the spark-submit command is executed with parameters --total-executor-cores 100
--executor-memory 80G. Each node has 32 cores. Mesos tries to use as few nodes as possible for the 100
cores requested, so in this case it starts Spark executors on 4 nodes (roundup(100 / 32)). Each executor has
been requested to have 80G of memory. The default value for spark.mesos.executor.memoryOverhead
is 10%, so Mesos allocates 88G to each executor. In Mesos, it can be seen that 88 * 4 = 352 GB is allocated to the 4
Spark executors. For more information, see the latest Spark documentation.
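The allocation arithmetic above can be checked with a few lines, using the values from the example:

```python
import math

# Worked version of the allocation arithmetic in the text.
total_cores = 100        # --total-executor-cores 100
cores_per_node = 32      # cores available on each node
executor_memory_g = 80   # --executor-memory 80G
overhead_ratio = 0.10    # spark.mesos.executor.memoryOverhead default of 10%

# Mesos uses as few nodes as possible: roundup(100 / 32) = 4 nodes.
nodes = math.ceil(total_cores / cores_per_node)

# Each executor gets its requested memory plus the 10% overhead: 80G + 8G = 88G.
overhead_g = int(executor_memory_g * overhead_ratio)
per_executor_g = executor_memory_g + overhead_g

# Total shown in Mesos for the 4 executors: 88 * 4 = 352 GB.
total_allocated_g = per_executor_g * nodes
```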
Additional points to note:
● On Urika-GX, the Mesos cluster runs in High Availability mode, with 3 Mesos Masters and Marathon
instances configured with Zookeeper.
● Unlike Marathon, Mesos does not offer any queue. Urika-GX scripts for flexing clusters and the mrun
command do not submit their jobs unless they know the resource requirement is satisfied. Once the flex-up
request is successful, YARN uses its own queue for all the Hadoop workloads.
● Spark accepts resource offers with fewer resources than it requested. For example, if a Spark job wants
1000 cores but only 800 cores are available, Mesos will offer those 800 cores to the Spark job. Spark then
chooses to accept or reject the offer. If Spark accepts the offer, the job is executed on the cluster. By
default, Spark accepts any resource offer, even if the resources in the offer are much less than what
the job requested. However, Spark users can control this behavior by specifying a minimum
acceptable resource ratio; for example, they could require that Spark only accept offers of at least 90% of the
requested cores. The configuration parameter that sets this ratio is
spark.scheduler.minRegisteredResourcesRatio. It can be set on the command line with
--conf spark.scheduler.minRegisteredResourcesRatio=N, where N is between 0.0 and 1.0.
● mrun and the flex scripts do not behave the way Spark behaves (as described in the previous bullet). mrun
accepts two command-line options:
○ --immediate=XXX (default 30 seconds)
○ --wait (default False)
When a user submits an mrun job, if more resources are needed than Mesos currently has available, the
command returns immediately, showing system usage and how many nodes are available versus how many
nodes are needed. If the user supplies the --wait option, mrun does not return, but instead continues to
poll Mesos until enough nodes are available. mrun will continue to poll Mesos for up to --immediate
seconds before timing out. Finally, once Mesos informs mrun that there are enough resources available, mrun
posts the job to Marathon.
When the requested resources are not available, flex scripts will display the current resource availability and
exit.
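The minimum-resource-ratio rule from the Spark bullet above reduces to a simple comparison. The helper below is a hypothetical sketch of the decision that spark.scheduler.minRegisteredResourcesRatio controls, not Spark's implementation:

```python
# Sketch of the minimum-acceptable-resource-ratio check. The parameter name
# spark.scheduler.minRegisteredResourcesRatio is Spark's; this helper is
# only an illustration of the rule it configures.

def accept_offer(requested_cores, offered_cores, min_ratio=0.0):
    """Return True if the offer meets the configured minimum ratio.

    min_ratio=0.0 mirrors the default described in the text:
    accept any offer, however small.
    """
    return offered_cores >= requested_cores * min_ratio

# Default behavior: 800 of 1000 requested cores is still accepted.
accept_offer(1000, 800)        # True
# With a 90% floor, the same offer is rejected.
accept_offer(1000, 800, 0.9)   # False
```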
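The mrun --wait/--immediate behavior described above can be sketched as a polling loop. This is an illustration only; poll_fn stands in for a query to Mesos, and none of this is the real mrun implementation:

```python
# Illustrative sketch of mrun's --wait / --immediate polling behavior.
# poll_fn is a hypothetical stand-in for asking Mesos how many nodes
# are currently free; not the real mrun code.

def wait_for_nodes(poll_fn, nodes_needed, wait=False, immediate=30):
    """Return True once enough nodes are free, False on timeout.

    Without --wait, a shortfall causes an immediate return (mrun reports
    availability versus need). With --wait, keep polling Mesos, here once
    per simulated second, for up to --immediate seconds before timing out.
    """
    attempts = immediate if wait else 1
    for _ in range(attempts):
        if poll_fn() >= nodes_needed:
            return True  # enough resources: post the job to Marathon
    return False  # report current availability vs. need and exit

# Free nodes grow from 2 to 6 as running jobs finish; with --wait the
# request succeeds once 5 nodes are free.
free = iter([2, 3, 4, 5, 6])
ok_with_wait = wait_for_nodes(lambda: next(free), nodes_needed=5,
                              wait=True, immediate=30)
```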
Resource Management
S3016
125