3 – InfiniBand® Cluster Setup and Administration
3-12
IB0054606-02 A

QLogic Distributed Subnet Administration
As InfiniBand® clusters are scaled into the petaflop range and beyond, a more
efficient method for handling queries to the Fabric Manager is required. One
issue is that while the Fabric Manager can configure and operate that many
nodes, under certain conditions it can become overloaded with queries from those
same nodes.
For example, consider an IB fabric consisting of 1,000 nodes, each with 4
processors. When a large MPI job is started across the entire fabric, each
process needs to collect IB path records for every other node in the fabric,
and every single process queries the subnet manager for these path records at
roughly the same time. This amounts to a total of 3.9 million path queries just
to start the job.
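The arithmetic behind that figure can be sketched as follows: 4,000 MPI processes each fetch a path record for the other 999 nodes, giving just under 4 million queries.

```python
nodes = 1000
procs_per_node = 4
processes = nodes * procs_per_node   # 4,000 MPI processes in the job
# each process collects a path record for every other node in the fabric
queries = processes * (nodes - 1)    # 4,000 x 999
print(queries)  # → 3996000, i.e. roughly 3.9 million
```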
In the past, MPI implementations have side-stepped this problem by
hand-crafting path records themselves, but this solution cannot be used when
advanced fabric management techniques such as virtual fabrics or mesh/torus
configurations are in use. In such cases, only the subnet manager itself has
enough information to correctly build a path record between two nodes.
The Distributed Subnet Administration (Distributed SA) solves this problem by
allowing each node to locally replicate the path records needed to reach the
other nodes on the fabric. At boot time, each Distributed SA queries the subnet
manager for information about the relevant parts of the fabric, backing off
whenever the subnet manager indicates that it is busy. Once this information is
in the Distributed SA's database, it is ready to answer local path queries from
MPI or other IB applications. If the fabric changes (due to a switch failure,
or a node being added or removed), the Distributed SA updates the affected
portions of its database. The Distributed SA can be installed and run on any
node in the fabric, but it is only needed on nodes running MPI applications.
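The "back off whenever the subnet manager is busy" behavior described above can be sketched as a generic retry loop with exponential backoff and jitter. This is an illustration only, not the Distributed SA's actual implementation; `send_query` and `SMBusyError` are hypothetical stand-ins for a query callable and a busy indication from the subnet manager.

```python
import random
import time

class SMBusyError(Exception):
    """Hypothetical signal that the subnet manager returned a 'busy' status."""

def query_with_backoff(send_query, max_attempts=8, base_delay=0.5):
    """Retry `send_query` until it succeeds, backing off when the SM is busy.

    Exponential backoff with random jitter keeps thousands of nodes that
    boot at the same time from retrying in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return send_query()
        except SMBusyError:
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)
    raise TimeoutError("subnet manager still busy after retries")
```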
Applications that use Distributed SA
The QLogic PSM Library has been extended to take advantage of the Distributed
SA; therefore, all MPIs that use the QLogic PSM Library can take advantage of
it. Other applications must be modified specifically to use the Distributed SA.
Developers writing such applications should refer to the header file
/usr/include/Infiniband/ofedplus_path.h for information on the Distributed SA
APIs. This file can be found on any node where the Distributed SA is installed.
For further assistance, contact QLogic Support.