
VIADEV_MAX_RENDEZVOUS    209715200 Bytes    The maximum amount of user buffer memory that will be locked down at any point in time for zero-copy IO.
4.7.1.1 VIADEV_NUM_RDMA_BUFFERS
This parameter specifies how many RDMA Write buffers are set up per connection. The amount of memory consumed per process for RDMA Write buffers is:
2 * VIADEV_NUM_RDMA_BUFFERS * 8360 * RP
Thus, by default, on a 16-node run with 1 process on each node (NC=16, LP=1, NP=16, RP=15), the memory used for RDMA Write buffers would be 2 * 256 * 8360 * 15 = 64204800 bytes (61.23 MB).
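The arithmetic above can be checked with a short script. This is a minimal sketch, not part of the MPI distribution: the function and constant names are hypothetical, RP is passed in directly rather than derived from NC, LP, and NP, and "MB" is taken to mean 1024 * 1024 bytes, which is what reproduces the 61.23 figure.

```python
# Hypothetical helper for the RDMA Write buffer formula above.
BUFFER_BYTES = 8360  # per-buffer size used in the formula


def rdma_write_buffer_bytes(num_rdma_buffers: int, rp: int) -> int:
    """Per-process memory: 2 * VIADEV_NUM_RDMA_BUFFERS * 8360 * RP."""
    return 2 * num_rdma_buffers * BUFFER_BYTES * rp


def to_mb(n_bytes: int) -> float:
    """Convert bytes to MB, assuming 1 MB = 1024 * 1024 bytes."""
    return n_bytes / (1024 * 1024)


# Worked example from the text: 256 buffers per connection, RP=15.
total = rdma_write_buffer_bytes(num_rdma_buffers=256, rp=15)
print(total, f"{to_mb(total):.2f} MB")  # prints: 64204800 61.23 MB
```

Halving VIADEV_NUM_RDMA_BUFFERS to 128 in this sketch halves the result to 32102400 bytes, since the formula is linear in the buffer count.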
4.7.1.2 VIADEV_RQ_DEPTH
This parameter specifies the Receive Queue (RQ) depth for each RDMA Queue Pair (QP). This depth acts as a flow control mechanism for message passing between MPI processes across the fabric. The amount of memory consumed per process for the RQ is:
VIADEV_RQ_DEPTH * 8360 * RP
So by default, on a 16-node run with 1 process on each node (NC=16, LP=1, NP=16, RP=15), the memory used for receive buffers would be 240 * 8360 * 15 = 30096000 bytes (28.70 MB).
4.7.1.3 VIADEV_SQ_DEPTH
This parameter specifies the Send Queue (SQ) depth for each RDMA Queue Pair (QP). This depth acts as a flow control mechanism between the MPI application and the local RNIC adapter.
Each SQ entry, when doing a particular IO operation (RDMA SEND), will consume one vbuf to describe the application data being sent. Assuming all SQs on all QPs are full of these SENDs, the amount of memory consumed per process is:
VIADEV_SQ_DEPTH * 8360 * RP
So by default, on a 16-node run with 1 process on each node (NC=16, LP=1, NP=16, RP=15), the maximum memory consumed for full SQs would be 256 * 8360 * 15 = 32102400 bytes (30.62 MB).
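The three per-process formulas in sections 4.7.1.1 through 4.7.1.3 can be compared side by side with a small estimator. This is a sketch under stated assumptions: the class and function names are hypothetical, the default values (256, 240, 256) are the ones used in the worked examples, and summing the three figures assumes the pools are independent, which the text does not state. The SQ figure is the worst case where every SQ is full.

```python
# Hypothetical estimator combining the three per-process formulas.
from dataclasses import dataclass

BUFFER_BYTES = 8360  # per-buffer size used in all three formulas


@dataclass
class ViadevSettings:
    # Defaults taken from the worked examples in the text.
    num_rdma_buffers: int = 256
    rq_depth: int = 240
    sq_depth: int = 256


def per_process_bytes(s: ViadevSettings, rp: int) -> dict:
    """Return bytes per process for each buffer pool, given RP."""
    return {
        "rdma_write": 2 * s.num_rdma_buffers * BUFFER_BYTES * rp,
        "rq": s.rq_depth * BUFFER_BYTES * rp,
        "sq_max": s.sq_depth * BUFFER_BYTES * rp,  # all SQs full
    }


usage = per_process_bytes(ViadevSettings(), rp=15)
for name, n_bytes in usage.items():
    # MB here means 1024 * 1024 bytes, matching the figures in the text.
    print(f"{name:<10} {n_bytes:>10} bytes  {n_bytes / 2**20:.2f} MB")

# Rough upper bound, assuming the three pools simply add up.
print("total", sum(usage.values()), "bytes")
```

With the defaults and RP=15 the three entries are 64204800, 30096000, and 32102400 bytes, matching the worked examples. Because each formula scales linearly with RP, doubling RP doubles every figure.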
4.7.1.4 VIADEV_MAX_RENDEZVOUS
This parameter limits the amount of application buffer memory that the MPI RDMA driver will lock down for zero-copy RDMA IO. Zero-copy IO is only done if the IO request from the application is large enough to warrant the overhead of buffer registration and rendezvous.