
After logging on to the Mesos web UI, users can view tasks on the summary page as well as the
resources reserved for that particular user. crayadm and root are global Mesos users that can view
all running frameworks and resource usage.
5.3 Use mrun to Retrieve Information About Marathon and Mesos Frameworks
Cray has developed the mrun command for launching applications. mrun enables running parallel jobs
on Urika-GX using resources managed by Mesos/Marathon. In addition, this command enables viewing the
currently active Mesos Frameworks and Marathon applications and enables specifying how mrun should
redirect STDIN. It provides extensive details on running Marathon applications and also enables
cancelling/stopping currently active Marathon applications.
The Cray Graph Engine (CGE) uses mrun to launch jobs under the Marathon framework on the Urika®-GX
system.
CAUTION: The mrun command cannot be executed within a tenant VM or while the system is operating
in the secure service mode. Both the munge and ncmd system services must be running for mrun/CGE to
work. If either service is stopped or disabled, mrun will no longer be able to function.
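
The service requirement in the caution above can be checked before launching a job. The following is a minimal Python sketch, not part of the mrun tooling. It assumes that munge and ncmd are managed as systemd units under exactly those names, which this guide does not state, and the helper names is_active and missing_services are hypothetical.

```python
import subprocess

# Service names taken from the caution above.
REQUIRED_SERVICES = ("munge", "ncmd")


def is_active(service, run=subprocess.run):
    """Return True if `systemctl is-active --quiet <service>` exits with 0.

    Assumption: the service is a systemd unit with this exact name.
    The `run` parameter exists only so the check can be exercised
    without a real systemd installation.
    """
    result = run(["systemctl", "is-active", "--quiet", service])
    return result.returncode == 0


def missing_services(services=REQUIRED_SERVICES, check=is_active):
    """Return the required services that are not currently active."""
    return [name for name in services if not check(name)]
```

Calling missing_services() on a login node returns an empty list when both services are active. Any name it returns identifies a service that should be examined before mrun is used.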
The mrun command must be executed from a login node. Some examples of using mrun are listed below:
Launch a job with mrun
$ mrun /bin/date
Wed Aug 10 13:31:51 CDT 2016
Display information about frameworks, applications and resources
Use the --info option of the mrun command to retrieve a quick snapshot view of Mesos frameworks,
Marathon applications, and available compute resources.
$ mrun --info
Active Frameworks:
IBM Spark Shell : Nodes[10] CPUs[ 240] : User[builder]
Jupyter Notebook : Nodes[ 0] CPUs[ 0] : User[urika-user]
marathon : Nodes[20] CPUs[ 480] : User[root]
Active Marathon Jobs:
/mrun/cge/user.dbport-2016-133-03-50-28.235572
: Nodes[20] CPUs[320/480] : user:user cmd:cge-server
Available Resources:
: Nodes[14] CPUs[336] idle nid000[00-13]
: Nodes[30] CPUs[480] busy nid000[14-29,32-45]
: Nodes[ 2] CPUs[???] down nid000[30-31]
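
For scripting, the "Available Resources" lines of the output shown above can be turned into a small summary. The following Python sketch is illustrative only: it works on the sample text copied from this example, the helper name parse_resources is hypothetical, and it assumes the line layout matches the sample exactly (other mrun versions may format the output differently).

```python
import re

# Sample text copied from the `mrun --info` output shown above.
SAMPLE = """\
Available Resources:
: Nodes[14] CPUs[336] idle nid000[00-13]
: Nodes[30] CPUs[480] busy nid000[14-29,32-45]
: Nodes[ 2] CPUs[???] down nid000[30-31]
"""

# Matches lines such as ": Nodes[14] CPUs[336] idle nid000[00-13]".
RESOURCE_LINE = re.compile(
    r"^\s*:\s*Nodes\[\s*(\d+)\]\s+CPUs\[\s*([\d?]+)\]\s+(\w+)\s+(\S+)\s*$"
)


def parse_resources(text):
    """Return {state: {"nodes": int, "cpus": int or None, "nids": str}}."""
    resources = {}
    for line in text.splitlines():
        match = RESOURCE_LINE.match(line)
        if not match:
            continue  # skip headers and any line in a different format
        nodes, cpus, state, nids = match.groups()
        resources[state] = {
            "nodes": int(nodes),
            # A CPU count reported as "???" (down nodes) is stored as None.
            "cpus": int(cpus) if cpus.isdigit() else None,
            "nids": nids,
        }
    return resources


summary = parse_resources(SAMPLE)
print(summary["idle"]["nodes"], "idle nodes:", summary["idle"]["nids"])
```

In practice the text would come from capturing the output of mrun --info on a login node rather than from the embedded sample.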
In the example output above, notice that CPUs[320/480] indicates that while the user only specified
mrun --ntasks-per-node=16 -N 20 (meaning the application is running on 320 CPUs), mrun intends ALL
applications to have exclusive access to each node it is running on,