MPI
The Message Passing Interface (MPI) protocol is a library of calls used by applications in a parallel
computing environment to communicate between nodes. MPI calls are optimized for performance in
compute clusters that take advantage of high-bandwidth, low-latency interconnects. In parallel
computing environments, code executes across multiple nodes simultaneously, and MPI facilitates the
communication and synchronization of these jobs across the entire cluster.
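To illustrate the message passing model, the following minimal sketch exchanges an integer between
two processes using point-to-point MPI calls. It assumes only a standards-compliant MPI
implementation and a launcher that starts at least two processes (for example, mpirun -np 2); it is
not tied to any particular vendor's MPI.

/* Minimal MPI point-to-point exchange: rank 0 sends an integer to rank 1.
 * Assumes any standards-compliant MPI implementation; launch with at least
 * two processes, for example "mpirun -np 2 ./mpi_hello". */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value;

    MPI_Init(&argc, &argv);                 /* initialize the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank        */

    if (rank == 0) {
        value = 42;
        /* send one int to rank 1, message tag 0 */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* blocking receive from rank 0 */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();                         /* shut down the MPI runtime  */
    return 0;
}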
To take advantage of the features of MPI, an application must be written and compiled to include the
libraries from the particular MPI implementation used. Several implementations of MPI are on the
market:
HP-MPI
Intel MPI
Publicly available versions such as MVAPICH2 and Open MPI
MPI has become the de facto IB ULP standard. In particular, HP-MPI has been accepted by more
independent software vendors (ISVs) than any other commercial MPI. By using shared libraries,
applications built on HP-MPI can transparently select interconnects, which significantly reduces the
effort required to support various popular interconnect technologies. HP-MPI is supported on HP-UX,
Linux, Tru64 UNIX, and Microsoft Windows Compute Cluster Server 2003.
IPoIB
Internet Protocol over InfiniBand (IPoIB) allows the use of TCP/IP- or UDP/IP-based applications
between nodes connected to an InfiniBand fabric. IPoIB supports IPv4 and IPv6 protocols and
addressing schemes. An InfiniBand HCA is configured through the operating system as a traditional
network adapter that can be used by standard IP-based applications such as ping, FTP, and
Telnet. IPoIB does not support the RDMA features of InfiniBand. Communication between IB nodes
using IPoIB and Ethernet nodes using IP requires a gateway/router interface.
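Because IPoIB presents the HCA as an ordinary IP interface, existing socket code runs over it
unchanged. The sketch below is a plain TCP client written with standard POSIX calls; the peer
address 192.168.10.5 and port 5001 are purely illustrative stand-ins for the IPoIB address of
another node.

/* Ordinary TCP client with nothing InfiniBand-specific in it. Pointing it
 * at the IPoIB address of a peer node (the address and port below are
 * illustrative) is enough to carry the traffic over the InfiniBand fabric. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_in peer;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(5001);                      /* illustrative port  */
    inet_pton(AF_INET, "192.168.10.5", &peer.sin_addr); /* peer IPoIB address */

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        perror("connect");
        close(fd);
        return 1;
    }

    const char msg[] = "hello over IPoIB\n";
    write(fd, msg, sizeof(msg) - 1);
    close(fd);
    return 0;
}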
RDMA-based protocols
DAPL – The Direct Access Programming Library (DAPL) allows low-latency RDMA communication
between nodes. uDAPL provides user-level access to RDMA functionality on InfiniBand, while
kDAPL provides the kernel-level API. To use RDMA for data transfers between nodes, applications
must be written against a specific DAPL implementation.
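As a rough sketch of the user-level API, the fragment below opens and closes a uDAPL interface
adapter. The header path and the adapter name "ofa-v2-ib0" are assumptions: both depend on the
installed DAT/uDAPL version and the entries in the system's dat.conf. A complete RDMA transfer
would additionally create a protection zone, event dispatchers, an endpoint, and registered
memory regions.

/* Hedged uDAPL sketch: open and close an interface adapter (IA). The
 * header path and the adapter name "ofa-v2-ib0" are assumptions that vary
 * with the installed DAT/uDAPL version and the contents of dat.conf. */
#include <dat2/udat.h>
#include <stdio.h>

int main(void)
{
    DAT_IA_HANDLE  ia;
    DAT_EVD_HANDLE async_evd = DAT_HANDLE_NULL;
    DAT_RETURN     ret;

    /* open the RDMA-capable interface adapter named in dat.conf */
    ret = dat_ia_open("ofa-v2-ib0", 8, &async_evd, &ia);
    if (ret != DAT_SUCCESS) {
        fprintf(stderr, "dat_ia_open failed: 0x%x\n", (unsigned)ret);
        return 1;
    }

    /* ... a real application would now create a protection zone, event
     *     dispatchers, an endpoint, register memory, and post RDMA work ... */

    dat_ia_close(ia, DAT_CLOSE_GRACEFUL_FLAG);
    return 0;
}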
SDP – Sockets Direct Protocol (SDP) is an RDMA protocol that operates from the kernel.
Applications must be written to take advantage of the SDP interface. SDP is based on the WinSock
Direct Protocol used by Microsoft server operating systems and is well suited for connecting
databases to application servers.
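On Linux with the OFED SDP stack, an application can request SDP explicitly by opening its stream
socket in the AF_INET_SDP address family while keeping ordinary IPv4 addressing; the constant and
its value of 27 are OFED conventions rather than part of POSIX, so the sketch below is an
assumption tied to that stack. (The same stack also ships a libsdp preload library that redirects
unmodified TCP sockets to SDP.)

/* Hedged SDP sketch: AF_INET_SDP (value 27) is an OFED convention, not
 * part of POSIX; addressing otherwise follows ordinary IPv4 sockets. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_INET_SDP
#define AF_INET_SDP 27          /* assumed OFED value for the SDP family */
#endif

int main(void)
{
    struct sockaddr_in peer;
    int fd = socket(AF_INET_SDP, SOCK_STREAM, 0);   /* SDP instead of TCP */
    if (fd < 0) { perror("socket(AF_INET_SDP)"); return 1; }

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;                      /* addressing stays IPv4 */
    peer.sin_port   = htons(5001);                  /* illustrative port     */
    inet_pton(AF_INET, "192.168.10.5", &peer.sin_addr);

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) == 0)
        printf("connected over SDP\n");

    close(fd);
    return 0;
}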
SRP – SCSI RDMA Protocol (SRP) is a data movement protocol that encapsulates SCSI commands over
InfiniBand for SAN networking. Operating at the kernel level, SRP copies SCSI commands between
systems using RDMA, providing low-latency communication with storage systems.
iSER – iSCSI Extensions for RDMA (iSER) is a storage standard originally specified on the iWARP RDMA
technology and now officially supported on InfiniBand. The iSER protocol brings iSCSI
manageability to RDMA storage operations.
NFS – The Network File System (NFS) is a storage protocol that has evolved since its inception in the
1980s, undergoing several generations of development while remaining network-independent. With
the development of high-performance I/O such as PCIe and the significant advances in memory
subsystems, NFS over RDMA on InfiniBand offers low-latency performance for transparent file sharing
across different platforms.