Memcheck: a memory error detector
4.9. Debugging MPI Parallel Programs with Valgrind
Memcheck supports debugging of distributed-memory applications which use the MPI message passing standard. This support consists of a library of wrapper functions for the PMPI_* interface. When incorporated into the application’s address space, either by direct linking or by LD_PRELOAD, the wrappers intercept calls to PMPI_Send, PMPI_Recv, etc. They then use client requests to inform Memcheck of memory state changes caused by the function being wrapped. This reduces the number of false positives that Memcheck otherwise typically reports for MPI applications.
The wrappers also take the opportunity to carefully check size and definedness of buffers passed as arguments to MPI functions, hence detecting errors such as passing undefined data to PMPI_Send, or receiving data into a buffer which is too small.
Unlike most of the rest of Valgrind, the wrapper library is subject to a BSD-style license, so you can link it into any code base you like. See the top of mpi/libmpiwrap.c for license details.
4.9.1. Building and installing the wrappers
The wrapper library will be built automatically if possible. Valgrind’s configure script will look for a suitable mpicc to build it with. This must be the same mpicc you use to build the MPI application you want to debug. By default, Valgrind tries mpicc, but you can specify a different one by using the configure-time option --with-mpicc.
Currently the wrappers are only buildable with mpiccs which are based on GNU GCC or Intel’s C++ Compiler.
Check that the configure script prints a line like this:
checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
If it says ... no, your mpicc has failed to compile and link a test MPI2 program.
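In long configure output the MPI check line is easy to miss. The following is a minimal sketch that scans saved configure output for it. The helper name mpicc_check_ok is hypothetical and not part of Valgrind, and the match is based only on the sample line shown above.

```shell
# mpicc_check_ok: hypothetical helper, not part of Valgrind.
# Reads configure output on stdin and succeeds only if the
# MPI check line reports "yes".
mpicc_check_ok() {
    grep 'checking for usable MPI2-compliant mpicc and mpi.h' |
        grep -q '\.\.\. yes'
}

# Demo on the sample line from this section.
if echo 'checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc' |
        mpicc_check_ok; then
    echo "mpicc usable"
fi
```

For a real build, one option is to save the output first, for example with ./configure --with-mpicc=/path/to/mpicc > configure.out 2>&1 (the path is a placeholder), and then run mpicc_check_ok < configure.out.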
If the configure test succeeds, continue in the usual way with make and make install. The final install tree should then contain libmpiwrap-<platform>.so.
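One way to confirm the install is to look for the library under the install prefix. This sketch assumes the lib/valgrind layout used in the LD_PRELOAD command in this section. The helper name find_mpiwrap is hypothetical; adjust the path if your install tree differs.

```shell
# find_mpiwrap PREFIX: hypothetical helper, not part of Valgrind.
# Prints every libmpiwrap-*.so under PREFIX/lib/valgrind and
# returns non-zero if none is found.
find_mpiwrap() {
    rc=1
    for f in "$1"/lib/valgrind/libmpiwrap-*.so; do
        [ -e "$f" ] || continue   # an unmatched glob stays literal; skip it
        echo "$f"
        rc=0
    done
    return "$rc"
}

# Example: check the prefix given to configure (/usr/local is only a fallback guess).
find_mpiwrap "${prefix:-/usr/local}" || echo "no wrapper library found"
```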
Compile a test MPI program (e.g., an MPI hello-world) and try this:
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so \
mpirun [args] $prefix/bin/valgrind ./hello
You should see something similar to the following:
valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
repeated for every process in the group. If you do not see these, there is a build/installation problem of some kind.
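With many processes, counting the banners by eye is error-prone. A minimal sketch, assuming the banner format shown above, counts the "Active for pid" lines in captured output so that the total can be compared with the number of processes launched. The helper name count_active_wrappers is hypothetical.

```shell
# count_active_wrappers: hypothetical helper, not part of Valgrind.
# Reads captured run output on stdin and prints how many
# "Active for pid" banners it contains (0 if none).
count_active_wrappers() {
    grep -c 'valgrind MPI wrappers [0-9][0-9]*: Active for pid [0-9][0-9]*' || true
}

# Demo on the two sample lines from this section; only the first is a banner.
printf '%s\n' \
    'valgrind MPI wrappers 31901: Active for pid 31901' \
    'valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options' |
    count_active_wrappers
```

To use it on a real run, save the combined stdout and stderr of the mpirun command to a file and feed that file to count_active_wrappers on stdin.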
The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required.