5. IP over InfiniBand (IPoIB)
The OFED stack lets you run TCP/IP over your InfiniBand network, so that applications which are not InfiniBand-aware can still use it. Several native InfiniBand applications also use IPoIB for host resolution (e.g. Lustre and SDP).
5.1 List the network devices
Make sure that the IPoIB module is loaded.
# modprobe ib_ipoib
You will now have an "ib" network interface for each port on your InfiniBand cards.
# ifconfig -a
<snip>
ib0       Link encap:UNSPEC  HWaddr 80-06-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
          BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

ib1       Link encap:UNSPEC  HWaddr 80-06-00-49-FE-80-00-00-00-00-00-00-00-00-00-00
          BROADCAST MULTICAST  MTU:2044  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
<snip>
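If ifconfig is not installed, the same check can be made by reading sysfs directly. The following is a minimal sketch, not part of the OFED tools: the function name list_ib_ifaces is made up for this example, and it assumes that IPoIB interfaces report link type 32 (ARPHRD_INFINIBAND) in their sysfs "type" file.

```shell
#!/bin/sh
# list_ib_ifaces [NET_DIR]: print the name of every InfiniBand-type interface.
# NET_DIR defaults to /sys/class/net; the parameter exists only so the
# function can be tried against a scratch directory.
list_ib_ifaces() {
    net_dir=${1:-/sys/class/net}
    for path in "$net_dir"/*; do
        # Skip entries that have no readable "type" file.
        [ -r "$path/type" ] || continue
        # 32 is ARPHRD_INFINIBAND in the Linux kernel headers.
        if [ "$(cat "$path/type")" = "32" ]; then
            basename "$path"
        fi
    done
}

list_ib_ifaces
```

On a host with the two ports shown above this should list ib0 and ib1; on a host without the ib_ipoib module loaded it prints nothing.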
5.2 IP Configuration
You can now configure the ib network devices using /etc/network/interfaces.
auto ib0
iface ib0 inet static
address 172.31.128.50
netmask 255.255.240.0
broadcast 172.31.143.255
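The broadcast value has to agree with the address and netmask, which is easy to get wrong with a netmask such as 255.255.240.0. As a hedged sketch, the helper below (broadcast_of is a name invented for this example) recomputes the broadcast address by ORing each address octet with the inverted netmask octet.

```shell
#!/bin/sh
# broadcast_of ADDRESS NETMASK: print the IPv4 broadcast address.
# The body runs in a subshell, so the IFS and "set -f" changes stay local.
broadcast_of() (
    set -f          # no filename globbing while splitting
    IFS=.           # split the dotted quads on "."
    set -- $1 $2    # $1..$4 = address octets, $5..$8 = netmask octets
    echo "$(( $1 | (255 - $5) )).$(( $2 | (255 - $6) )).$(( $3 | (255 - $7) )).$(( $4 | (255 - $8) ))"
)

broadcast_of 172.31.128.50 255.255.240.0
# prints 172.31.143.255
```

The result matches the broadcast line in the stanza above.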
Bring the network device up, as normal.
ifup ib0
5.3 Connected vs Unconnected Mode
IPoIB can run over two InfiniBand transports: Unreliable Datagram (UD) mode or Connected Mode (CM). The differences between these two modes are described in:
RFC4392 - IP over InfiniBand (IPoIB) Architecture
RFC4391 - Transmission of IP over InfiniBand (IPoIB) (UD mode)
RFC4755 - IP over InfiniBand: Connected Mode
ADDME: Pro/cons of these two methods?
You can switch between these two modes at runtime with:
echo datagram > /sys/class/net/ibX/mode
echo connected > /sys/class/net/ibX/mode
The default is datagram (UD) mode. If you wish to use CM, you can add a script to /etc/network/if-up.d to set CM mode automatically on your interfaces when they are configured.
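Such a hook script could look roughly like the sketch below. It is an illustration, not a script shipped with OFED or Debian: the file name /etc/network/if-up.d/ipoib-connected-mode and the helper set_ipoib_mode are assumptions made for this example. It relies on ifupdown exporting the interface name in the IFACE environment variable, and it assumes that your IPoIB interfaces are named ib0, ib1 and so on.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/network/if-up.d/ipoib-connected-mode
# (the file name is an assumption; the file must be executable).

# set_ipoib_mode IFACE MODE [NET_DIR]: write MODE to the interface's mode file.
# NET_DIR defaults to /sys/class/net.
set_ipoib_mode() {
    mode_file="${3:-/sys/class/net}/$1/mode"
    # No writable mode file means this is not an IPoIB device: do nothing.
    [ -w "$mode_file" ] || return 0
    echo "$2" > "$mode_file"
}

# ifupdown sets IFACE to the interface that has just been brought up.
case "${IFACE:-}" in
    ib*) set_ipoib_mode "$IFACE" connected ;;
esac
```

Hooks in this directory are started through run-parts, which skips file names containing a dot, so the script should be saved without a .sh extension. You can confirm the result afterwards by reading /sys/class/net/ibX/mode.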
5.4 TCP tuning
In order to obtain maximum IPoIB throughput you may need to tweak the MTU and various kernel TCP buffer and window settings. See the details in the ipoib_release_notes.txt document in the ofed-docs package.
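Before changing anything it can be useful to record the current values. The sketch below only reads settings and changes nothing. The choice of keys is an assumption made for this example (they are commonly tuned socket buffer limits), not a list taken from ipoib_release_notes.txt, and the function name show_tcp_tunables is made up.

```shell
#!/bin/sh
# show_tcp_tunables [SYS_DIR]: print the current value of a few buffer settings.
# SYS_DIR defaults to /proc/sys. The key list is illustrative only.
show_tcp_tunables() {
    sys_dir=${1:-/proc/sys}
    for key in net/core/rmem_max net/core/wmem_max \
               net/ipv4/tcp_rmem net/ipv4/tcp_wmem; do
        # Skip keys that this kernel does not expose.
        [ -r "$sys_dir/$key" ] || continue
        # Print in sysctl style: dotted key name, then the value.
        printf '%s = %s\n' "$(echo "$key" | tr / .)" "$(cat "$sys_dir/$key")"
    done
}

show_tcp_tunables
```

Keeping this output next to any new settings makes it easier to revert if throughput gets worse rather than better.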
5.5 ARP and dual ported cards
If you have a dual-ported card with both ports on the same IB subnet but on different IP subnets, you will need to adjust the ARP settings for the IPoIB interfaces so that each interface only answers ARP requests for an address configured on that interface. See ipoib_release_notes.txt in the ofed-docs package for a full discussion of this issue.
sysctl -w net.ipv4.conf.ib0.arp_ignore=1
sysctl -w net.ipv4.conf.ib1.arp_ignore=1
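Settings made with sysctl -w are lost at reboot. One way to keep them, sketched here under assumptions, is to generate the matching sysctl.conf lines and store them in a file. The helper name arp_ignore_conf and the target file name mentioned below are made up for this example.

```shell
#!/bin/sh
# arp_ignore_conf IFACE...: print one sysctl.conf line per interface,
# equivalent to the "sysctl -w" commands above.
arp_ignore_conf() {
    for iface in "$@"; do
        printf 'net.ipv4.conf.%s.arp_ignore = 1\n' "$iface"
    done
}

arp_ignore_conf ib0 ib1
# prints:
# net.ipv4.conf.ib0.arp_ignore = 1
# net.ipv4.conf.ib1.arp_ignore = 1
```

The output could be redirected to a file such as /etc/sysctl.d/ipoib-arp.conf (a hypothetical name) and applied with sysctl -p followed by that file name. The per-interface keys only exist once the ib interfaces do, so if the file is processed at boot before ib_ipoib is loaded the settings may not take effect; check them with sysctl after a reboot.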