dpumanager(8)
dpumanager - DECmpp Sx data parallel unit (DPU) job manager daemon, Version 1.1
Syntax
etc/dpumanager [ options... ]
Description
The DECmpp Sx job manager daemon, dpumanager, maintains the queue for data
parallel unit (DPU) jobs, determines which job has access to the DPU at any
particular time, and ensures that the DPU state is properly initialized between
jobs. It uses a privileged access mode to the array control unit (ACU) driver in
order to monitor calls to open(2) and close(2) and certain calls to ioctl(2).
This allows it to determine which processes need access to the DPU.
When started, dpumanager puts itself in the background by forking a child
process and exiting the parent. This behavior may be suppressed by using the
-nodaemon option.
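
The fork-and-exit behavior can be sketched as follows. This is an illustration only, not the daemon's source: the function name background and its nodaemon parameter are hypothetical, and a real daemon typically does more than this (for example, detaching from the controlling terminal), which the description above does not cover.

```python
import os


def background(nodaemon: bool) -> int:
    """Return the PID of the process that carries on as the job manager.

    Hypothetical helper: with nodaemon set, no fork happens and the
    caller stays in the foreground.
    """
    if nodaemon:
        return os.getpid()
    pid = os.fork()
    if pid > 0:
        # Parent: exit at once so the invoking shell regains control.
        os._exit(0)
    # Child: continues running in the background.
    return os.getpid()
```

Calling background(False) at startup leaves only the child running; calling background(True) corresponds to starting the daemon with -nodaemon.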
Jobs are queued in order of priority and the time when the DPU was requested.
The job manager assigns a queue priority based primarily on the job’s processor
element (PE) memory requirement (refer to mplimit(1)). In addition, jobs with
a time limit of one minute or less get a small boost in priority. The priority is an
integer from 0 to 100, where a lower number is a higher priority (refer to mpq(1)).
A job is queued in front of the first job that has a lower priority. A job’s priority
increases every time another (higher priority) job skips in front of it, so it is not
possible for small jobs to permanently block execution of a large job.
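
The queuing rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the daemon's implementation: the Job type, the enqueue function, and the AGING_STEP constant are hypothetical, and the amount by which a skipped job's priority improves is not documented here, so a step of 1 is assumed.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the size of the boost a skipped job receives is not documented.
AGING_STEP = 1


@dataclass
class Job:
    pid: int
    priority: int  # 0 to 100; a lower number is a higher priority


def enqueue(queue: List[Job], new_job: Job) -> None:
    """Queue new_job in front of the first job that has a lower priority."""
    pos = len(queue)
    for i, job in enumerate(queue):
        if job.priority > new_job.priority:  # larger number = lower priority
            pos = i
            break
    # Every job the newcomer skips in front of moves toward priority 0.
    for job in queue[pos:]:
        job.priority = max(0, job.priority - AGING_STEP)
    queue.insert(pos, new_job)
```

Because each skipped job moves one step closer to priority 0, a large job that is repeatedly overtaken eventually reaches a priority that small jobs can no longer beat, which is the starvation guarantee described above. Jobs of equal priority stay in request order, since a new job is placed behind existing jobs with the same number.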
The first n jobs on the queue are loaded in DPU memory and share the machine
in a round-robin fashion. The number n varies according to job requirements
and memory availability. The maximum value for n is determined by the -jobs
command line option.
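
One way to picture how n is chosen and how the resident jobs take turns is sketched below. The memory model is an assumption made for illustration: each job is given a single pe_memory figure in arbitrary units, and jobs are admitted from the front of the queue until the next one no longer fits or the -jobs limit is reached. The names QueuedJob, resident_jobs, and time_slices are hypothetical.

```python
from dataclasses import dataclass
from itertools import cycle, islice
from typing import List


@dataclass
class QueuedJob:
    pid: int
    pe_memory: int  # PE memory requirement, arbitrary units (assumed model)


def resident_jobs(queue: List[QueuedJob], total_memory: int,
                  max_jobs: int) -> List[QueuedJob]:
    """Return the first n queued jobs that fit in DPU memory together."""
    resident: List[QueuedJob] = []
    used = 0
    for job in queue[:max_jobs]:  # max_jobs plays the role of -jobs
        if used + job.pe_memory > total_memory:
            break  # only a prefix of the queue is loaded
        resident.append(job)
        used += job.pe_memory
    return resident


def time_slices(resident: List[QueuedJob], count: int) -> List[int]:
    """Return the PIDs that receive the next count time slices, in order."""
    return [job.pid for job in islice(cycle(resident), count)]
```

For example, with 100 units of memory, a limit of three jobs, and queued jobs needing 40, 40, 40, and 10 units, only the first two jobs are loaded and they alternate time slices.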
The round-robin scheduling involves cooperation between the job manager and
the mppehook or mppeback process. The mppehook/mppeback process informs
the job manager of program requirements. The job manager informs
mppehook/mppeback when a job’s time slice is over. The mppehook/mppeback process
informs the job manager when the job is in a quiescent state and can be safely
swapped out. Jobs that do not run under mppehook or mppeback are not able
to share the DPU. Note that all user jobs normally run under mppehook or
mppeback. They run under mppeback if the user explicitly invokes the DECmpp
Sx Parallel Programming Environment (MPPE) using the mppe command and
then invokes the program from within that user interface; otherwise, they run
under mppehook until they fault. If you use the ps command, you see the
user’s executable listed twice: the mppehook or mppeback process is the lower
numbered of the two user processes of the job, and this process number is the
same as the number reported by mpq. The child process, which is executing the
user’s code, has a higher process number.
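
The time-slice handshake between the job manager and the mppehook/mppeback process can be modeled as a small state machine. The sketch below is illustrative only: the state and event names are hypothetical, the actual message format is not described here, and the RESUMED event (a job being given the DPU again) is an assumption added to close the cycle.

```python
from enum import Enum, auto


class JobState(Enum):
    RUNNING = auto()
    SLICE_EXPIRED = auto()  # manager has said the time slice is over
    QUIESCENT = auto()      # hook process has said the job can be swapped


class Event(Enum):
    SLICE_OVER = auto()  # job manager -> mppehook/mppeback
    QUIESCED = auto()    # mppehook/mppeback -> job manager
    RESUMED = auto()     # assumed: job is given the DPU again


TRANSITIONS = {
    (JobState.RUNNING, Event.SLICE_OVER): JobState.SLICE_EXPIRED,
    (JobState.SLICE_EXPIRED, Event.QUIESCED): JobState.QUIESCENT,
    (JobState.QUIESCENT, Event.RESUMED): JobState.RUNNING,
}


def step(state: JobState, event: Event) -> JobState:
    """Apply one event; reject events that are out of order."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event.name} is not valid in state {state.name}")


def can_swap_out(state: JobState) -> bool:
    """A job may be swapped out only after it reports a quiescent state."""
    return state is JobState.QUIESCENT
```

The point of the model is that the job manager never swaps a job out on its own authority: it waits for the quiescent report, which only a job running under mppehook or mppeback can send.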
The number of jobs in memory is greater than one only when multiple jobs at the
front of the queue are able to share the machine. A job cannot share the machine
if it needs all of memory or it is not running under mppehook or mppeback (user
jobs normally run under one of these two processes). If a job cannot share the