IB0054606-02 A
C Integration with a Batch Queuing System
Most cluster systems use some kind of batch queuing system as an orderly way to
provide users with access to the resources they need to meet their job’s
performance requirements. One task of the cluster administrator is to allow users
to submit MPI jobs through these batch queuing systems.
For Open MPI, the project's Frequently Asked Questions (FAQs) document how to
use the MPI with three batch queuing systems. The FAQ links for each of the
three batch queuing systems are as follows:
Torque / PBS Pro:
http://www.open-mpi.org/faq/?category=tm
SLURM:
http://www.open-mpi.org/faq/?category=slurm
Bproc:
http://www.open-mpi.org/faq/?category=bproc
This appendix has two sections that deal with process and file clean-up after
batch MPI/PSM jobs have completed: Clean Termination of MPI Processes and
Clean-up PSM Shared Memory Files.
Clean Termination of MPI Processes
The InfiniPath software normally ensures clean termination of all MPI programs
when a job ends, but in some rare circumstances an MPI process may remain
alive and potentially interfere with future MPI jobs. To avoid this problem, run a
script before and after each batch job that kills all unwanted processes. QLogic
does not provide such a script, but it is useful to know how to find out which
processes on a node are using the QLogic interconnect. The easiest way to do
this is with the fuser command, which is normally installed in /sbin.
Run these commands as a root user to ensure that all processes are reported.
# /sbin/fuser -v /dev/ipath
/dev/ipath: 22648m 22651m
In this example, processes 22648 and 22651 are using the QLogic interconnect. It
is also possible to use this command (as a root user):
# lsof /dev/ipath
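
A clean-up script of the kind described above could be sketched as follows. This
is an illustration only, not a QLogic-supplied tool. The function names
extract_pids and kill_ipath_users, the DEVICE variable, and the five-second
grace period are all hypothetical choices. The sketch also assumes that each
node is dedicated to one batch job at a time, so that any process still holding
/dev/ipath before or after a job is unwanted. It relies on fuser writing only
the process IDs to standard output.

```shell
#!/bin/sh
# Sketch only: QLogic does not ship this script; all names are hypothetical.
# Assumption: the node runs one batch job at a time, so every process that
# still holds the device in a prolog or epilog is a leftover.

DEVICE="${DEVICE:-/dev/ipath}"

# Read fuser-style text on stdin (for example "/dev/ipath: 22648m 22651m")
# and print one numeric PID per line, dropping the path and access letters.
extract_pids() {
    sed -e 's/^[^:]*://' \
        | tr -s ' \t' '\n\n' \
        | sed -n -e 's/^\([0-9][0-9]*\)[A-Za-z]*$/\1/p'
}

# Ask leftover users of the device to exit, then force any that remain.
# Must be run as root so that all processes are reported.
kill_ipath_users() {
    pids=$(/sbin/fuser "$DEVICE" 2>/dev/null | extract_pids)
    [ -n "$pids" ] || return 0

    # Polite request first (word splitting of $pids is intentional).
    kill -TERM $pids 2>/dev/null
    sleep 5   # assumed grace period

    # Force-kill whatever is still alive.
    for pid in $pids; do
        kill -0 "$pid" 2>/dev/null && kill -KILL "$pid" 2>/dev/null
    done
    return 0
}
```

A batch system prolog or epilog could source this file and then call
kill_ipath_users as root. On a node shared between jobs, this approach would
also kill processes that belong to other running jobs, so the script would need
an additional check of process ownership there.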