
Example
Figure 9-4: HCA Self-Test with Errors on Two Ports
[root@1750]# /usr/local/topspin/sbin/hca_self_test
---- Performing InfiniBand HCA Self Test ----
Number of HCAs Detected ................ 1
PCI Device Check ....................... PASS
Host Driver Version .................... rhel3-2.4.21-4.ELsmp-2.0.0-530
Host Driver RPM Check .................. PASS
HCA Type of HCA #0 ..................... Cougar
HCA Firmware on HCA #0 ................. v3.01.0000
HCA Firmware Check on HCA #0 ........... PASS
Host Driver Initialization ............. PASS
Number of HCA Ports Active ............. 0
Port State of Port #0 on HCA #0 ........ DOWN
Port State of Port #1 on HCA #0 ........ DOWN
Error Counter Check .................... FAIL
REASON: found errors in the following counters
Errors in /proc/topspin/core/ca1/port1/counters
Symbol error counter: 29
Kernel Syslog Check .................... PASS
5. To locate further information about an error counter failure, run cat on the counters file for the specific port.
Example
Figure 9-5: Example of Error Counter Output
[root@1750]# cat /proc/topspin/core/ca1/port1/counters
Symbol error counter: 29
Link error recovery counter: 0
Link downed counter: 1
Port receive errors: 0
Port receive remote physical errors: 0
Port receive switch relay errors: 0
Port transmit discards: 2
Port transmit constrain errors: 0
Port receive constrain errors: 0
Local link integrity errors: 0
Excessive buffer overrun errors: 0
VL15 dropped: 0
Port transmit data: 1133136
Port receive data: 1099008
Port transmit packets: 15738
Port receive packets: 15264
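
When several ports need checking, the counters file can also be read by a script. The following Python sketch is illustrative only and is not part of the Topspin tools: it assumes the file keeps the "name: value" layout shown in Figure 9-5, and the function names (parse_counters, nonzero_error_counters, read_port_counters) and the TRAFFIC_COUNTERS set are hypothetical. It reports every non-zero counter other than the four traffic counters, which is stricter than the self-test output in Figure 9-4 (that run reported only the symbol error counter); the criteria hca_self_test applies are not documented here.

```python
# Counters that measure traffic volume rather than faults.
# The names are copied from the output in Figure 9-5 (assumed stable).
TRAFFIC_COUNTERS = {
    "Port transmit data",
    "Port receive data",
    "Port transmit packets",
    "Port receive packets",
}


def parse_counters(text):
    """Parse 'name: value' lines into a dict of counter name -> int.

    Lines without a colon or without an integer value are skipped.
    """
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # no colon on this line
        value = value.strip()
        if value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def nonzero_error_counters(counters):
    """Return the non-traffic counters whose value is not zero."""
    return {
        name: value
        for name, value in counters.items()
        if value != 0 and name not in TRAFFIC_COUNTERS
    }


def read_port_counters(path):
    """Read and parse a counters file such as
    /proc/topspin/core/ca1/port1/counters (path taken from Figure 9-5)."""
    with open(path) as f:
        return parse_counters(f.read())
```

For the output in Figure 9-5, nonzero_error_counters would report the symbol error counter (29), the link downed counter (1), and port transmit discards (2). Calling read_port_counters with the path from the self-test failure message gives the input for that check.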