
3 – InfiniBand® Cluster Setup and Administration
Performance Settings and Management Tips
IB0054606-02 A
High Risk Tuning for Intel Harpertown CPUs
The following tuning for the Harpertown generation of Intel Xeon CPUs carries a higher risk but can improve bandwidth.
For nodes with Intel Harpertown (Xeon 54xx) CPUs, you can add pcie_caps=0x51 and pcie_coalesce=1 to the modprobe.conf file. For example:
options ib_qib pcie_caps=0x51 pcie_coalesce=1
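Before adding these options, it can help to confirm that a node really has a Harpertown CPU. The following is a minimal Python sketch, not part of the OFED+ software. It assumes that the model name reported in /proc/cpuinfo contains a Xeon 54xx model number with an E, L, or X prefix (for example E5430 or X5450). The function name and the regular expression are illustrative.

```python
import re

# Assumption: Harpertown parts report a model name such as
# "Intel(R) Xeon(R) CPU  E5430  @ 2.66GHz" (E, L, or X prefix, then 54xx).
HARPERTOWN_MODEL = re.compile(r"Xeon\(R\)\s+CPU\s+[ELX]54\d\d\b")


def is_harpertown(cpuinfo_text):
    """Return True if any 'model name' line looks like a Xeon 54xx CPU."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "model name" and HARPERTOWN_MODEL.search(value):
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not a Linux host, or /proc is unavailable
    if is_harpertown(text):
        print("Xeon 54xx detected; candidate line for modprobe.conf:")
        print("options ib_qib pcie_caps=0x51 pcie_coalesce=1")
    else:
        print("No Xeon 54xx CPU detected; leave modprobe.conf unchanged.")
```

The script only prints a suggestion. It does not edit modprobe.conf, so the decision to accept the higher-risk tuning stays with the administrator.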
If syslog reports the following problem, you can perform the typical diagnostic described in the following paragraphs:
[PCIe Poisoned TLP][Send DMA memory read]
Another potential issue is that, after starting openibd, messages such as the following appear on the console:
Message from syslogd@st2019 at Nov 14 16:55:02 ...
kernel:Uhhuh. NMI received for unknown reason 3d on CPU 0
After this happens, you may also see the following message in the syslog:
Mth dd hh:mm:ss st2019 kernel: ib_qib 0000:0a:00.0: infinipath0: Fatal Hardware Error, no longer usable, SN AIB1013A43727
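To check whether a node has already hit one of these conditions, the log can be searched for the message fragments quoted above. The following Python sketch is one possible way to do that. The search strings are copied from the messages in this section. The category labels, the function name, and the log file passed on the command line are illustrative assumptions.

```python
import sys

# Substrings taken from the messages quoted in this section.
# The labels on the left are illustrative names, not driver terminology.
SIGNATURES = {
    "poisoned_tlp": "[PCIe Poisoned TLP]",
    "unknown_nmi": "NMI received for unknown reason",
    "fatal_hw_error": "Fatal Hardware Error, no longer usable",
}


def find_failure_signatures(log_lines):
    """Return a dict mapping each matched label to its matching lines."""
    hits = {}
    for line in log_lines:
        for label, needle in SIGNATURES.items():
            if needle in line:
                hits.setdefault(label, []).append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Pass one or more log files as arguments; the location of the
    # system log differs between distributions.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            for label, lines in find_failure_signatures(f).items():
                print(f"{path}: {label}: {len(lines)} occurrence(s)")
```

A match on the fatal hardware error indicates that the adapter is no longer usable and that the node needs a reboot before any retest.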
These problems typically occur on the first run of an MPI program over the PSM transport, or immediately after the link becomes active. Once this happens, the adapter remains unusable until the system is rebooted. To resolve this issue, try the following solutions in order:
1. Remove pcie_coalesce=1.
2. Restart openibd and try the MPI program again.
3. Remove both the pcie_caps=0x51 and pcie_coalesce=1 options from the ib_qib line in the modprobe.conf file, and reboot the system.
NOTE
Removing both options will technically avoid the problem but can result in an unnecessary performance decrease. If the system has already failed with the above diagnostic, it will need to be rebooted. Note that in the modprobe.conf file, all options for a particular kernel module must be on the same line, not spread across repeated options ib_qib lines.
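The edits described above (removing one or both options, and keeping all ib_qib options on a single line) can be sketched as a small text transformation. The Python function below is a hypothetical helper, not a tool shipped with the software. It works on the file contents as a string and never writes to disk. If the same option appears more than once, it keeps the last value; that choice belongs to this sketch and is not documented module behavior.

```python
def rewrite_module_options(conf_text, module="ib_qib", remove=()):
    """Merge all 'options <module>' lines into one and drop named options.

    `remove` holds option names without values, e.g. ("pcie_coalesce",).
    """
    merged = []       # option tokens, in first-seen order
    out_lines = []    # every other line, unchanged
    insert_at = None  # where the first 'options <module>' line was found
    for line in conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "options" and fields[1] == module:
            if insert_at is None:
                insert_at = len(out_lines)
            for token in fields[2:]:
                name = token.split("=", 1)[0]
                if name in remove:
                    continue
                # Sketch's choice: a later value replaces an earlier one.
                merged = [t for t in merged if t.split("=", 1)[0] != name]
                merged.append(token)
        else:
            out_lines.append(line)
    if insert_at is not None and merged:
        out_lines.insert(insert_at, " ".join(["options", module] + merged))
    return "\n".join(out_lines) + ("\n" if out_lines else "")


if __name__ == "__main__":
    sample = (
        "# local settings\n"
        "options ib_qib pcie_caps=0x51\n"
        "options ib_qib pcie_coalesce=1\n"
    )
    # First solution: drop only pcie_coalesce, keep everything on one line.
    print(rewrite_module_options(sample, remove=("pcie_coalesce",)), end="")
```

Passing remove=("pcie_caps", "pcie_coalesce") corresponds to the last solution in the list. When no options remain, the sketch omits the options ib_qib line entirely.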