Wednesday, July 8, 2015

IP over Infiniband (IPoIB)


5. IP over Infiniband (IPoIB)

The OFED stack lets you run TCP/IP over your infiniband network, so applications that are not infiniband-aware can still use it. Several native infiniband applications also use IPoIB for host resolution (e.g. Lustre and SDP).

5.1 List the network devices

Make sure the IPoIB module is loaded.
#modprobe ib_ipoib
You will now have an "ib" network interface for each of your infiniband cards.
#ifconfig -a

<snip>
ib0       Link encap:UNSPEC  HWaddr 80-06-00-48-FE-80-00-00-00-00-00-00-00-00-00-00  
          BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

ib1       Link encap:UNSPEC  HWaddr 80-06-00-49-FE-80-00-00-00-00-00-00-00-00-00-00  
          BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
<snip>
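
As an alternative to reading through the ifconfig output, the IPoIB interfaces can be picked out from sysfs. This is a minimal sketch, assuming a Linux kernel that exposes /sys/class/net/<iface>/type, where the value 32 is ARPHRD_INFINIBAND. The function name list_ipoib_ifaces and its optional directory argument are illustrative only and are not part of OFED.

```shell
# List network interfaces whose link type is InfiniBand.
# ARPHRD_INFINIBAND is 32 in the Linux kernel headers (linux/if_arp.h).
# The optional argument is the sysfs directory to scan; it exists only so
# the function can be pointed at a test directory.
list_ipoib_ifaces() {
    sysfs_net="${1:-/sys/class/net}"
    for dev in "$sysfs_net"/*; do
        # Skip entries that have no readable "type" attribute.
        [ -r "$dev/type" ] || continue
        if [ "$(cat "$dev/type")" = "32" ]; then
            basename "$dev"
        fi
    done
}
```

Calling list_ipoib_ifaces with no argument scans the live system and prints one interface name per line.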

5.2 IP Configuration

You can now configure the ib network devices using /etc/network/interfaces.
auto ib0
iface ib0 inet static
  address 172.31.128.50
  netmask 255.255.240.0
  broadcast 172.31.143.255
Bring the network device up, as normal.
ifup ib0
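
The broadcast value in the stanza above follows from the address and netmask: each octet is (address AND netmask) OR (NOT netmask). The sketch below reproduces that arithmetic so you can check your own values before editing the file. The helper name broadcast_addr is hypothetical.

```shell
# Compute the IPv4 broadcast address from a dotted-quad address and netmask.
# Each broadcast octet is (address AND mask) OR (255 - mask).
broadcast_addr() {
    addr="$1"
    mask="$2"
    old_ifs="$IFS"
    IFS=.
    # Split both dotted quads on "." into positional parameters:
    # $1..$4 hold the address octets, $5..$8 hold the netmask octets.
    set -- $addr $mask
    IFS="$old_ifs"
    echo "$(( ($1 & $5) | (255 - $5) )).$(( ($2 & $6) | (255 - $6) )).$(( ($3 & $7) | (255 - $7) )).$(( ($4 & $8) | (255 - $8) ))"
}

# Usage: broadcast_addr 172.31.128.50 255.255.240.0
```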

5.3 Connected vs Datagram Mode

IPoIB can run over two infiniband transports: Unreliable Datagram (UD) mode or Connected Mode (CM). The differences between these two modes are described in:
RFC4392 - IP over InfiniBand (IPoIB) Architecture
RFC4391 - Transmission of IP over InfiniBand (IPoIB) (UD mode)
RFC4755 - IP over InfiniBand: Connected Mode
ADDME: Pro/cons of these two methods?
You can switch between these two modes at runtime with:
  
 echo datagram > /sys/class/net/ibX/mode 
 echo connected > /sys/class/net/ibX/mode
The default is datagram (UD) mode. If you wish to use CM, you can add a script to /etc/network/if-up.d to automatically set CM mode on your interfaces when they are configured.
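
One possible shape for such a hook script is sketched below. It assumes ifupdown, which exports the name of the interface being configured to hook scripts in the IFACE environment variable. The file name in the comment and the function name set_ipoib_mode are hypothetical, and the third argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical hook, e.g. saved as /etc/network/if-up.d/ipoib-mode
# and made executable.

# Write the requested IPoIB mode for one interface.
# $1 = interface name, $2 = datagram|connected, $3 = sysfs dir (testing only).
set_ipoib_mode() {
    iface="$1"
    mode="$2"
    sysfs_net="${3:-/sys/class/net}"
    case "$iface" in
        ib*) ;;                 # only touch IPoIB interfaces
        *) return 0 ;;          # anything else (or empty) is ignored
    esac
    case "$mode" in
        datagram|connected) ;;
        *) echo "unknown IPoIB mode: $mode" >&2; return 1 ;;
    esac
    # Skip quietly if the interface has no writable mode attribute.
    [ -w "$sysfs_net/$iface/mode" ] || return 0
    echo "$mode" > "$sysfs_net/$iface/mode"
}

# ifupdown sets IFACE; when it is unset or not an ib interface, nothing happens.
set_ipoib_mode "${IFACE:-}" connected
```

Restricting the hook to names starting with "ib" keeps it from touching Ethernet or loopback interfaces, which have no mode attribute.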

5.4 TCP tuning

In order to obtain maximum IPoIB throughput you may need to tweak the MTU and various kernel TCP buffer and window settings. See the details in the ipoib_release_notes.txt document in the ofed-docs package.
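
Before changing anything it helps to record the current settings. The sketch below only reads values and changes nothing. The list of keys is an assumption about which buffer settings are usually of interest, not a list taken from ipoib_release_notes.txt, and the name show_tcp_buffers is hypothetical.

```shell
# Print the current values of some kernel socket and TCP buffer settings.
# Read-only: nothing is modified.
# The optional argument is the sysctl root under procfs (testing only).
show_tcp_buffers() {
    proc_sys="${1:-/proc/sys}"
    for key in net/core/rmem_max net/core/wmem_max net/ipv4/tcp_rmem net/ipv4/tcp_wmem; do
        # Skip keys that do not exist on this kernel.
        [ -r "$proc_sys/$key" ] || continue
        # Convert the path form (net/core/rmem_max) to the sysctl name form.
        name=$(echo "$key" | tr / .)
        echo "$name = $(cat "$proc_sys/$key")"
    done
}
```

Running show_tcp_buffers before and after tuning gives a simple record of what was changed.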

5.5 ARP and dual ported cards

If you have a dual ported card with both ports on the same IB subnet, but different IP subnets, you will need to tweak the ARP settings for the IPoIB interfaces. See ipoib_release_notes.txt in the ofed-docs package for a full discussion of this issue.
   sysctl -w net.ipv4.conf.ib0.arp_ignore=1
   sysctl -w net.ipv4.conf.ib1.arp_ignore=1
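
The sysctl -w commands above do not survive a reboot. One way to keep them, sketched here, is to generate the matching lines for a sysctl configuration file. The function name arp_ignore_conf and the file name in the comment are hypothetical. Note that these per-interface keys exist only once the ib interfaces are present, so the file has to be applied after the IPoIB module is loaded.

```shell
# Emit one arp_ignore=1 sysctl line per interface named on the command line.
arp_ignore_conf() {
    for iface in "$@"; do
        echo "net.ipv4.conf.$iface.arp_ignore = 1"
    done
}

# Example (as root, file name is illustrative):
#   arp_ignore_conf ib0 ib1 > /etc/sysctl.d/60-ipoib-arp.conf
```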
