
3-4

Cisco SFS InfiniBand Host Drivers User Guide for Linux

OL-12309-01

Chapter 3      IP over IB Protocol

Subinterfaces

          NOARP  MTU:1480  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

Verify that you see the ib0.8002 output.

Step 6

Configure the new interface just as you would the parent interface. (See the “Manually Configuring IPoIB for Default IB Partition” section on page 3-2.)

The following example shows how to configure the new interface:

host1# ifconfig ib0.8002 192.168.12.1 netmask 255.255.255.0
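
The subinterface name in this example appears to be the parent interface, a dot, and the partition key with its colon removed (ib0 and 80:02 give ib0.8002). The following is a minimal shell sketch of that naming rule, assuming the pattern holds for other partition keys. The subif_name function is a hypothetical helper and is not part of the Cisco host drivers.

```shell
# Hypothetical helper: derive the subinterface name from a parent interface
# and a partition key written as HH:HH (for example, ib0 and 80:02).
# Assumes the <parent>.<pkey without colon> pattern seen in this chapter.
subif_name() {
    parent=$1
    pkey=$2
    printf '%s.%s\n' "$parent" "$(printf '%s' "$pkey" | tr -d ':')"
}

subif_name ib0 80:02   # prints ib0.8002
```

The result can stand in for the literal name, for example: ifconfig "$(subif_name ib0 80:02)" 192.168.12.1 netmask 255.255.255.0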

Removing a Subinterface Associated with a Specific IB Partition

To remove a subinterface, perform the following steps: 

Step 1

Take the subinterface offline. You cannot remove a subinterface until you bring it down.

The following example shows how to take the subinterface offline:

 

host1# ifconfig ib0.8002 down

Step 2

As the root user, remove the partition key value from the parent interface.

The following example shows how to remove the partition 80:02 from the primary interface ib0: 

host1# /usr/local/topspin/sbin/ipoibcfg del ib0 80:02
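
Because the subinterface must be down before it can be removed, the order of Step 1 and Step 2 matters. The following dry-run sketch prints the two commands in that order for a given parent interface and partition key. The print_remove_cmds function is a hypothetical helper, and the subinterface naming pattern is an assumption based on the ib0.8002 example. Nothing is executed.

```shell
# Hypothetical dry-run helper: print, in order, the commands from Step 1 and
# Step 2 for a given parent interface and partition key. Nothing is executed.
print_remove_cmds() {
    parent=$1
    pkey=$2
    # Assumed naming pattern: <parent>.<pkey without colon>, as in ib0.8002.
    subif="$parent.$(printf '%s' "$pkey" | tr -d ':')"
    printf 'ifconfig %s down\n' "$subif"
    printf '/usr/local/topspin/sbin/ipoibcfg del %s %s\n' "$parent" "$pkey"
}

print_remove_cmds ib0 80:02
```

Review the printed commands before entering them as the root user.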

Step 3

(Optional) Verify that the subinterface no longer appears in the interface list by entering the ifconfig -a command.

The following example shows how to verify that the subinterface no longer appears in the interface list:

host1# ifconfig -a

eth0      Link encap:Ethernet  HWaddr 00:30:48:20:D5:D1

          inet addr:172.29.237.206  Bcast:172.29.239.255  Mask:255.255.252.0

          inet6 addr: fe80::230:48ff:fe20:d5d1/64 Scope:Link

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:9091465 errors:0 dropped:0 overruns:0 frame:0

          TX packets:505050 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:1517373743 (1.4 GiB)  TX bytes:39074067 (37.2 MiB)

          Base address:0x3040 Memory:dd420000-dd440000

 

ib0       Link encap:Ethernet  HWaddr F8:79:D1:23:9A:2B

          inet addr:192.168.0.1 Bcast:192.168.0.255  Mask:255.255.255.0

          inet6 addr: fe80::9879:d1ff:fe20:f4e7/64 Scope:Link

          UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:9 overruns:0 carrier:0

          collisions:0 txqueuelen:1024

          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

 

ib0.8002  Link encap:Ethernet  HWaddr 00:00:00:00:00:00

          BROADCAST MULTICAST  MTU:2044  Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
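
Scanning a long interface list by eye is error prone, so the check in Step 3 can also be scripted. The following is a minimal sketch, assuming that each entry in the ifconfig -a output starts in column 1 with the interface name, as in the listings in this chapter. The iface_listed function is a hypothetical helper, and the sample text is a shortened stand-in for real output.

```shell
# Hypothetical helper: succeed if the named interface heads an entry in
# ifconfig -a style output read from standard input.
# Assumes each entry starts in column 1 with the interface name.
iface_listed() {
    awk -v name="$1" '$1 == name && /^[^ \t]/ { found = 1 } END { exit !found }'
}

# Shortened sample input; on a live host, pipe ifconfig -a instead.
sample='eth0      Link encap:Ethernet  HWaddr 00:30:48:20:D5:D1
ib0       Link encap:Ethernet  HWaddr F8:79:D1:23:9A:2B'

if printf '%s\n' "$sample" | iface_listed ib0.8002; then
    echo "ib0.8002 is still listed"
else
    echo "ib0.8002 is not listed"
fi
```

On a host with the drivers installed, the same check would read: ifconfig -a | iface_listed ib0.8002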

Cisco SFS InfiniBand Host Drivers User Guide for Linux, Release 3.2.0, June 2007, Text Part Number OL-12309-01
