Tuesday, July 7, 2015

14 Monitoring and Controlling the InfiniBand Fabric

This chapter describes how to monitor and control the InfiniBand fabric.

14.1 Monitoring the InfiniBand Fabric


14.1.1 Identifying All Switches in the Fabric

You can use the ibswitches command to identify the Sun Network QDR InfiniBand Gateway Switches in the InfiniBand fabric in your Exalogic machine. This command displays the Global Unique Identifier (GUID), name, Local Identifier (LID), and LID mask control (LMC) for each switch. The output of the command is a mapping of GUID to LID for switches in the fabric.
On any command-line interface (CLI), run the following command:
# ibswitches
The output is displayed, as in the following example:
Switch : 0x0021283a8389a0a0 ports 36 "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
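Because most of the commands in this chapter take a LID as an argument, it can help to reduce the ibswitches output to LID and GUID pairs. The following is a minimal sketch, not part of the switch software: the function name switch_lids is hypothetical, and it assumes the line layout shown in the example above.

```shell
# Hypothetical helper: print "LID GUID" for every switch line read on stdin.
# Assumes the ibswitches line layout shown in the example output.
switch_lids() {
  awk '$1 == "Switch" {
    guid = ""; lid = ""
    for (i = 2; i <= NF; i++) {
      # The first 0x-prefixed field is taken to be the switch GUID.
      if (guid == "" && $i ~ /^0x[0-9a-fA-F]+$/) guid = $i
      # The field that follows the word "lid" is taken to be the LID.
      if ($i == "lid" && i < NF) lid = $(i + 1)
    }
    if (guid != "" && lid != "") print lid, guid
  }'
}
```

You would typically pipe the command into the function, for example: ibswitches | switch_lids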

14.1.2 Identifying All HCAs in the Fabric

You can use the ibhosts command to display identity information about the host channel adapters (HCAs) in the InfiniBand fabric in a subnet. This command displays the GUID and name for each HCA.
On the command-line interface (CLI), run the following command:
# ibhosts
The output is displayed, as in the following example:
Ca : 0x0003ba000100e388 ports 2 "nsn33-43 HCA-1"
Ca : 0x5080020000911310 ports 1 "nsn32-20 HCA-1"
Ca : 0x50800200008e532c ports 1 "ib-71 HCA-1"
Ca : 0x50800200008e5328 ports 1 "ib-70 HCA-1"
Ca : 0x50800200008296a4 ports 2 "ib-90 HCA-1"
.
.
.
#
Note:
The output in the example is just a portion of the full output and varies for each InfiniBand topology.

14.1.3 Displaying the InfiniBand Fabric Topology

To understand the routing within your InfiniBand fabric, use the ibnetdiscover command, which displays the node-to-node connectivity. The amount of output depends on the size of your fabric. You can also use this command to display the LIDs of HCAs.
On the command-line interface (CLI), enter the following command:
# ibnetdiscover
The output is displayed, as in the following example:
# Topology file: generated on Sat Apr 13 22:28:55 2002
#
# Max of 1 hops discovered
# Initiated from node 0021283a8389a0a0 port 0021283a8389a0a0
vendid=0x2c9
devid=0xbd36
sysimgguid=0x21283a8389a0a3
switchguid=0x21283a8389a0a0(21283a8389a0a0)
Switch   36 "S-0021283a8389a0a0" # "Sun DCS 36 QDR switch localhost" enhanced port 0 lid 15 lmc 0
[23]    "H-0003ba000100e388"[2](3ba000100e38a) # "nsn33-43 HCA-1" lid 14 4xQDR
vendid=0x2c9
devid=0x673c
sysimgguid=0x3ba000100e38b
caguid=0x3ba000100e388
Ca   2 "H-0003ba000100e388" # "nsn33-43 HCA-1"
[2](3ba000100e38a)   "S-0021283a8389a0a0"[23] # lid 14 lmc 0 "Sun DCS 36 QDR switch localhost" lid 15 4xQDR
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
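To pull only the HCA LIDs out of the topology listing, you can filter the switch-side port lines that point at an HCA. The sketch below assumes the ibnetdiscover layout shown in the example, where HCA node names begin with H- and the LID follows the quoted description. The function name hca_lids is hypothetical.

```shell
# Hypothetical helper: print "LID description" for each HCA seen on a switch port.
# Reads ibnetdiscover output on stdin; assumes the layout in the example output.
hca_lids() {
  awk '/^\[[0-9]+\][ \t]*"H-/ && /" lid [0-9]+/ {
    line = $0
    sub(/^[^#]*# "/, "", line)            # drop everything up to the description
    name = line; sub(/".*/, "", name)     # keep the text before the closing quote
    lid = line
    sub(/^[^"]*" lid /, "", lid)          # drop the description and the word lid
    sub(/ .*/, "", lid)                   # keep only the LID number
    print lid, name
  }'
}
```

For example: ibnetdiscover | hca_lids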

14.1.4 Displaying a Route Through the Fabric

You sometimes need to know the route between two nodes in the InfiniBand fabric. The ibtracert command can provide that information by displaying the GUIDs, ports, and LIDs of the nodes.
On the command-line interface (CLI), run the following command:
# ibtracert slid dlid
where slid is the LID of the source node and dlid is the LID of the destination node in the fabric.
The output is displayed, as in the following example:
# ibtracert 15 14
#
From switch {0x0021283a8389a0a0} portnum 0 lid 15-15 "Sun DCS 36 QDR switch localhost"
[23] -> ca port {0x0003ba000100e38a}[2] lid 14-14 "nsn33-43 HCA-1"
To ca {0x0003ba000100e388} portnum 2 lid 14-14 "nsn33-43 HCA-1"
#
For this example:
The route starts at the switch with GUID 0x0021283a8389a0a0, using port 0. The switch is LID 15, and its node description is Sun DCS 36 QDR switch localhost. The route leaves the switch through port 23 and enters the HCA through port 2, whose port GUID is 0x0003ba000100e38a. The HCA is LID 14.
Note:
The actual output for your InfiniBand fabric will differ from that in the example.

14.1.5 Displaying the Link Status of a Node

To check the link status of a node in the InfiniBand fabric, use the ibportstate command, which displays the state, width, and speed of a port on that node.
On the command-line interface (CLI), run the following command:
# ibportstate lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
The output is displayed, as in the following example:
# ibportstate 15 23

PortInfo:
# Port info: Lid 15 port 23
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
Peer PortInfo:
# Port info: Lid 15 DR path slid 15; dlid 65535; 0,23
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................10.0 Gbps
#
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
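When you check many ports, a scripted pass or fail result is easier to read than the full listing. The following sketch treats a port as healthy only if it is Active, LinkUp, 4X wide, and running at 10.0 Gbps, as in the example. That threshold is an assumption based on the 4xQDR links shown in this chapter, and the function name link_is_healthy is hypothetical.

```shell
# Hypothetical helper: exit status 0 if the local port in ibportstate output
# (read on stdin) is Active, LinkUp, 4X, and 10.0 Gbps; otherwise exit status 1.
link_is_healthy() {
  awk '
    /^Peer PortInfo/ { exit }             # inspect only the local port section
    /^[A-Za-z0-9]+:\.+/ {
      key = $0; sub(/:.*/, "", key)       # field name before the colon
      val = $0; sub(/^[A-Za-z0-9]+:\.+/, "", val)   # value after the dot leader
      seen[key] = val
    }
    END {
      ok = (seen["LinkState"] == "Active" &&
            seen["PhysLinkState"] == "LinkUp" &&
            seen["LinkWidthActive"] == "4X" &&
            seen["LinkSpeedActive"] == "10.0 Gbps")
      exit (ok ? 0 : 1)
    }'
}
```

For example: ibportstate 15 23 | link_is_healthy || echo "lid 15 port 23 needs attention"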

14.1.6 Displaying Counters for a Node

To help ascertain the health of a node in the fabric, use the perfquery command to display the performance, error, and data counters for that node.
On the command-line interface (CLI), enter the following command:
# perfquery lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
Note:
If a port value of 255 is specified for a switch node, the counters are the total for all switch ports.
For example:
# perfquery 15 23
#
# Port counters: Lid 15 port 23
PortSelect:......................23
CounterSelect:...................0x1b01
SymbolErrors:....................0
.
.
.
VL15Dropped:.....................0
XmtData:.........................20232
RcvData:.........................20232
XmtPkts:.........................281
RcvPkts:.........................281
Note:
The output in the example is just a portion of the full output.
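Most counters in a healthy fabric are zero, so it is often enough to list only the ones that are not. The sketch below skips the selector fields and the data counters named in the example and prints the remaining nonzero counters. It assumes the counter names and dot-leader layout shown above, and the function name nonzero_errors is hypothetical.

```shell
# Hypothetical helper: print "name value" for each nonzero counter in
# perfquery output read on stdin, ignoring selector and data counters.
nonzero_errors() {
  awk -F: '
    /^[A-Za-z0-9]+:\.+[0-9]+$/ {
      name = $1
      value = $2; gsub(/\./, "", value)   # strip the dot leader
      if (name ~ /^(PortSelect|CounterSelect|XmtData|RcvData|XmtPkts|RcvPkts)$/) next
      if (value + 0 > 0) print name, value
    }'
}
```

For example: perfquery 15 23 | nonzero_errors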

14.1.7 Displaying Data Counters for a Node

To list the data counters for a node in the fabric, use the ibdatacounts command.
On the command-line interface (CLI), enter the following command:
# ibdatacounts lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
For example:
# ibdatacounts 15 23
#
XmtData:.........................6048
RcvData:.........................6048
XmtPkts:.........................84
RcvPkts:.........................84
Note:
The actual output for your InfiniBand fabric will differ from that in the example.
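The data counters are easier to compare with other throughput figures when they are expressed in bytes. The following sketch assumes, as defined for the InfiniBand port data counters but not stated in this chapter, that XmtData and RcvData count units of four octets. The function name data_bytes is hypothetical.

```shell
# Hypothetical helper: convert XmtData and RcvData from ibdatacounts output
# (read on stdin) to bytes, assuming each count represents four octets.
data_bytes() {
  awk -F: '
    $1 == "XmtData" || $1 == "RcvData" {
      words = $2; gsub(/\./, "", words)   # strip the dot leader
      printf "%s_bytes %.0f\n", $1, words * 4
    }'
}
```

For example: ibdatacounts 15 23 | data_bytes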

14.1.8 Displaying Low-Level Detailed Information for a Node

If intensive troubleshooting is necessary to resolve a problem, you can use the smpquery command to display very detailed information about a node in the fabric.
On the command-line interface (CLI), enter the following command:
# smpquery switchinfo lid
where lid is the LID of the node in the fabric.
For example:
# smpquery switchinfo 15
#
# Switch info: Lid 15
LinearFdbCap:....................49152
RandomFdbCap:....................0
McastFdbCap:.....................4096
LinearFdbTop:....................16
DefPort:.........................0
DefMcastPrimPort:................255
DefMcastNotPrimPort:.............255
LifeTime:........................18
StateChange:.....................0
LidsPerPort:.....................0
PartEnforceCap:..................32
InboundPartEnf:..................1
OutboundPartEnf:.................1
FilterRawInbound:................1
FilterRawOutbound:...............1
EnhancedPort0:...................1
#
Note:
The actual output for your InfiniBand fabric will differ from that in the example.

14.1.9 Displaying Low-Level Detailed Information for a Port

If intensive troubleshooting is necessary to resolve a problem, you can use the smpquery command to display very detailed information about a port.
On the command-line interface (CLI), enter the following command:
# smpquery portinfo lid port
where lid is the LID of the node in the fabric, and port is the port of the node.
For example:
# smpquery portinfo 15 23
#
Mkey:............................0x0000000000000000
GidPrefix:.......................0x0000000000000000
Lid:.............................0x0000
SMLid:...........................0x0000
CapMask:.........................0x0
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................0
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
.
.
.
SubnetTimeout:...................0
RespTimeVal:.....................0
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................85
RoundTrip:.......................16777215
#
Note:
The actual output for your InfiniBand fabric will differ from that in the example, and it is just a portion of the full output.

14.1.10 Mapping LIDs to GUIDs

In the InfiniBand fabric in Exalogic machines, the Subnet Manager assigns subnet-specific LIDs to the nodes in the fabric. Many InfiniBand commands require a LID to address a particular InfiniBand device.
In addition, the output of a command might identify InfiniBand devices by their LIDs. You can create a file that maps node LIDs to node GUIDs, which can help you administer your InfiniBand fabric.
Note:
Creation of the mapping file is not a requirement for InfiniBand administration.
The following procedure creates a file that lists the LID in hexadecimal, the GUID in hexadecimal, and the node description:
  1. Create an inventory file:
    # osmtest -f c -i inventory.txt
    The inventory.txt file can be used for other purposes too, besides this procedure.
  2. Create a mapping file:
    # grep -e '^lid' -e 'port_guid' -e 'desc' inventory.txt | sed 's/^lid/\nlid/' > mapping.txt
  3. Edit the latter half of the mapping.txt file to remove the nonessential information. The content of the mapping.txt file looks similar to the following:
    lid 0x14
    port_guid 0x0021283a8620b0a0
    # node_desc Sun DCS 72 QDR switch 1.2(LC)
    lid 0x15
    port_guid 0x0021283a8620b0b0
    # node_desc Sun DCS 72 QDR switch 1.2(LC)
    lid 0x16
    port_guid 0x0021283a8620b0c0
    # node_desc Sun DCS 72 QDR switch 1.2(LC)
    
Note:
The output in the example is just a portion of the entire file.
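Commands such as ibtracert and ibportstate take decimal LIDs, while the mapping file lists them in hexadecimal. The following sketch looks up the GUID for a decimal LID. It assumes the mapping.txt layout shown in the example, with lowercase hexadecimal LIDs and no zero padding, and the function name guid_for_lid is hypothetical.

```shell
# Hypothetical helper: print the port GUID recorded for a decimal LID.
# Usage: guid_for_lid LID [mapping-file]   (the file defaults to mapping.txt)
guid_for_lid() {
  want=$(printf '0x%x' "$1") || return 1  # decimal 21 becomes 0x15
  awk -v want="$want" '
    $1 == "lid" { current = $2 }
    $1 == "port_guid" && (current "") == (want "") { print $2; found = 1; exit }
    END { exit (found ? 0 : 1) }' "${2:-mapping.txt}"
}
```

For example, guid_for_lid 21 prints the port_guid entry that follows the lid 0x15 line. The function returns a nonzero exit status if the LID is not in the file.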

14.1.11 Performing Comprehensive Diagnostics for the Entire Fabric

If you require a full testing of your InfiniBand fabric, you can use the ibdiagnet command to perform many tests with verbose results. The command is a useful tool to determine the general overall health of the InfiniBand fabric.
On the command-line interface (CLI), run the following command:
# ibdiagnet -v -r
The ibdiagnet.log file contains the log of the testing.

14.1.12 Performing Comprehensive Diagnostics for a Route

You can use the ibdiagpath command to perform some of the same comprehensive tests for a particular route.
On the command-line interface (CLI), run the following command:
# ibdiagpath -v -l slid dlid
where slid is the LID of the source node in the fabric, and dlid is the LID of the destination node.
The ibdiagpath.log file contains the log of the testing.

14.1.13 Determining Changes to the InfiniBand Topology

If your fabric has a number of nodes that are suspect, the osmtest command enables you to take a snapshot (inventory file) of your fabric and at a later time compare that file to the present conditions.
Note:
Although this procedure is most useful after initializing the Subnet Manager, it can be performed at any time.
Complete the following steps:
  1. Ensure that Subnet Manager is initiated.
  2. On the command-line interface (CLI), run the following command to take a snapshot of the topology:
    # osmtest -f c
    For example:
    # osmtest -f c
    Command Line Arguments
    Done with args
    Flow = Create Inventory
    Aug 13 19:44:53 601222 [B7D466C0] 0x7f -> Setting log level to: 0x03
    Aug 13 19:44:53 601969 [B7D466C0] 0x02 -> osm_vendor_init: 1000 pending umads specified
    using default guid 0x21283a8620b0f0
    Aug 13 19:44:53 612312 [B7D466C0] 0x02 -> osm_vendor_bind: Binding to port 0x21283a8620b0f0
    Aug 13 19:44:53 636876 [B7D466C0] 0x02 -> osmtest_validate_sa_class_port_info:
    -----------------------------
    SA Class Port Info:
    base_ver:1
    class_ver:2
    cap_mask:0x2602
    cap_mask2:0x0
    resp_time_val:0x10
    -----------------------------
    OSMTEST: TEST "Create Inventory" PASS
    #
    
  3. After an event, compare the present topology to that saved in the inventory file, as in the following example:
    # osmtest -f v
    Command Line Arguments
    Done with args
    Flow = Validate Inventory
    Aug 13 19:45:02 342143 [B7EF96C0] 0x7f -> Setting log level to: 0x03
    Aug 13 19:45:02 342857 [B7EF96C0] 0x02 -> osm_vendor_init: 1000 pending umads specified
    using default guid 0x21283a8620b0f0
    Aug 13 19:45:02 351555 [B7EF96C0] 0x02 -> osm_vendor_bind: Binding to port 0x21283a8620b0f0
    Aug 13 19:45:02 375997 [B7EF96C0] 0x02 -> osmtest_validate_sa_class_port_info:
    -----------------------------
    SA Class Port Info:
    base_ver:1
    class_ver:2
    cap_mask:0x2602
    cap_mask2:0x0
    resp_time_val:0x10
    -----------------------------
    Aug 13 19:45:02 378991 [B7EF96C0] 0x01 -> osmtest_validate_node_data: Checking node 0x0021283a8620b0a0, LID 0x14
    Aug 13 19:45:02 379172 [B7EF96C0] 0x01 -> osmtest_validate_node_data: Checking node 0x0021283a8620b0b0, LID 0x15
    .
    .
    .
    Aug 13 19:45:02 480201 [B7EF96C0] 0x01 -> osmtest_validate_single_path_rec_guid_pair:
    Checking src 0x0021283a8620b0f0 to dest 0x0021283a8620b0f0
    Aug 13 19:45:02 480588 [B7EF96C0] 0x01 -> osmtest_validate_path_data: Checking path SLID 0x19 to DLID 0x19
    Aug 13 19:45:02 480989 [B7EF96C0] 0x02 -> osmtest_run:
    ***************** ALL TESTS PASS *****************
    OSMTEST: TEST "Validate Inventory" PASS
    #
    
    Note:
    Depending on the size of your InfiniBand fabric, the output from the osmtest command could be tens of thousands of lines long.
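Because the output can be so long, you may prefer to check only for the final result line. The following sketch reports success if the output contains a PASS result line like the ones in the examples above. It makes no assumption about what a failing run prints, and the function name osmtest_passed is hypothetical.

```shell
# Hypothetical helper: exit status 0 if osmtest output read on stdin contains
# a result line of the form OSMTEST: TEST "<flow name>" PASS.
osmtest_passed() {
  grep -q 'OSMTEST: TEST ".*" PASS'
}
```

For example: osmtest -f v 2>&1 | osmtest_passed || echo "inventory validation did not pass"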

14.1.14 Determining Which Links Are Experiencing Significant Errors

You can use the ibdiagnet command to determine which links are experiencing symbol errors and recovery errors by injecting packets.
On the command-line interface (CLI), run the following command:
# ibdiagnet -c 100 -P all=1
In this instance of the ibdiagnet command, 100 test packets are injected into each link and the -P all=1 option returns all counters that increment during the test.
In the output of the ibdiagnet command, search for the symbol_error_counter string. That line contains the symbol error count in hexadecimal. The preceding lines identify the node and port with the errors. Symbol errors are minor errors, and if there are relatively few during the diagnostic, they can be monitored.
Note:
The InfiniBand specification requires a bit error rate (BER) of 10^-12 or better, which corresponds to a maximum allowable symbol error rate of 120 errors per hour.
In addition, in the output of the ibdiagnet command, search for the link_error_recovery_counter string.
That line contains the recovery error count in hexadecimal. The preceding lines identify the node and port with the errors. Recovery errors are major errors and the respective links must be investigated for the cause of the rapid symbol error propagation.
Additionally, the ibdiagnet.log file contains the log of the testing.
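To compare a reported symbol error count with the limit of 120 errors per hour, you need the count in decimal and the interval over which it accumulated. The following sketch does that arithmetic. It assumes that you supply the count as a 0x-prefixed hexadecimal value copied from the output and the interval in seconds, for example the time since the counters were last cleared. The function name symbol_errors_per_hour is hypothetical.

```shell
# Hypothetical helper: print the hourly symbol error rate and return status 0
# if it is within the 120 errors per hour limit.
# Usage: symbol_errors_per_hour COUNT SECONDS   (COUNT may be 0x-prefixed hex)
symbol_errors_per_hour() {
  count=$(( $1 ))                         # arithmetic expansion accepts 0x hex
  seconds=$(( $2 ))
  if [ "$seconds" -le 0 ]; then
    echo "interval must be positive" >&2
    return 2
  fi
  per_hour=$(( count * 3600 / seconds ))  # integer division rounds down
  echo "$per_hour"
  [ "$per_hour" -le 120 ]
}
```

For example, symbol_errors_per_hour 0x1f 1800 evaluates 31 errors over 30 minutes.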

14.1.15 Checking All Ports

To perform a quick check of all ports of all nodes in your InfiniBand fabric, you can use the ibcheckstate command.
On the command-line interface (CLI), run the following command:
# ibcheckstate -v
The output is displayed, as in the following example:
# Checking Switch: nodeguid 0x0021283a8389a0a0
Node check lid 15: OK
Port check lid 15 port 23: OK
Port check lid 15 port 19: OK
.
.
.
# Checking Ca: nodeguid 0x0003ba000100e388
Node check lid 14: OK
Port check lid 14 port 2: OK
## Summary: 5 nodes checked, 0 bad nodes found
## 10 ports checked, 0 ports with bad state found
#
Note:
The ibcheckstate command requires time to complete, depending upon the size of your InfiniBand fabric. Without the -v option, the output contains only failed ports. The output in the example is only a small portion of the actual output.
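If you run the check regularly, the two summary lines are usually all you need. The following sketch adds up the bad node and bad port counts from the summary and reports success only if both are zero. It assumes the summary wording shown in the example, and the function name fabric_state_ok is hypothetical.

```shell
# Hypothetical helper: exit status 0 if the ibcheckstate summary read on stdin
# reports zero bad nodes and zero ports with bad state; otherwise status 1.
fabric_state_ok() {
  awk '
    /^## .*checked, [0-9]+ / {
      sub(/.*checked, /, "")              # leaves for example: 0 bad nodes found
      bad += $1
      seen = 1
    }
    END { exit ((seen && bad == 0) ? 0 : 1) }'
}
```

For example: ibcheckstate | fabric_state_ok || echo "ibcheckstate reported bad nodes or ports"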

14.2 Controlling the InfiniBand Fabric


14.2.1 Clearing Error Counters

If you are troubleshooting a port, the perfquery command provides counters of errors occurring at that port. To determine if the problem has been resolved, you can reset all of the error counters to 0 with the ibclearerrors command.
On the command-line interface (CLI), run the following command:
# ibclearerrors
The output is displayed, as in the following example:
## Summary: 5 nodes cleared 0 errors
#

14.2.2 Clearing Data Counters

When you are optimizing the InfiniBand fabric for performance, you might want to know how the throughput increases or decreases according to changes you are making to the fabric and Subnet Manager. The ibclearcounters command enables you to reset the data counters for all ports to 0.
On the command-line interface (CLI), run the following command:
# ibclearcounters
The output is displayed, as in the following example:
## Summary: 5 nodes cleared 0 errors
#

14.2.3 Resetting a Port

You might need to reset a port to determine its functionality.
On the command-line interface (CLI), run the following command:
# ibportstate lid port reset
where lid is the LID of the node in the fabric, and port is the port of the node.
For example:
# ibportstate 15 23 reset
Initial PortInfo:
# Port info: Lid 15 port 23
LinkState:.......................Down
PhysLinkState:...................Disabled
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................2.5 Gbps
After PortInfo set:
# Port info: Lid 15 port 23
LinkState:.......................Down
PhysLinkState:...................Disabled
After PortInfo set:
# Port info: Lid 15 port 23

LinkState:.......................Down
PhysLinkState:...................PortConfigurationTraining
#

14.2.4 Setting Port Speed

You can manually set the speed of a single port to help determine symbol error generation. The ibportstate command can set the speed to 2.5, 5.0, or 10.0 Gbps.
On the command-line interface (CLI), run the following command:
# ibportstate lid port speed <value>
where lid is the LID of the node in the fabric, port is the port of the node, and <value> is the speed you want to set.
Note:
Speed values can be added together to enable several speeds at once. For example, speed 7 enables 2.5, 5.0, and 10.0 Gbps.
For example:
# ibportstate 15 23 speed 1
Initial PortInfo:
# Port info: Lid 15 port 23
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
After PortInfo set:
# Port info: Lid 15 port 23
LinkSpeedEnabled:................2.5 Gbps
# ibportstate 15 23 speed 7
Initial PortInfo:
# Port info: Lid 15 port 23
LinkSpeedEnabled:................2.5 Gbps
After PortInfo set:
# Port info: Lid 15 port 23
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
#
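The speed value works as a bit mask. The following sketch decodes a value into the speeds it enables, assuming the standard InfiniBand encoding in which 1 is 2.5 Gbps, 2 is 5.0 Gbps, and 4 is 10.0 Gbps. The examples above confirm only the values 1 and 7, and the function name decode_speed is hypothetical.

```shell
# Hypothetical helper: print the speeds enabled by an ibportstate speed value.
# Assumes bit 1 = 2.5 Gbps, bit 2 = 5.0 Gbps, bit 4 = 10.0 Gbps.
decode_speed() {
  value=$(( $1 ))
  out=""
  for pair in 1:2.5 2:5.0 4:10.0; do
    bit=${pair%%:*}                       # text before the colon
    gbps=${pair#*:}                       # text after the colon
    if [ $(( value & bit )) -ne 0 ]; then
      out="${out:+$out or }$gbps Gbps"
    fi
  done
  echo "$out"
}
```

For example, decode_speed 7 prints the same list that LinkSpeedEnabled shows after the speed 7 command above.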

14.2.5 Disabling a Port

If a port is found to be problematic because of a bad cable connection or physical damage to the connectors, you can disable the port.
On the command-line interface (CLI), run the following command:
# disableswitchport [--reason=reason] connector|ibdev port
where reason is the reason for disabling the port (Blacklist or Partition), connector is the number of the QSFP connector (0A–15B), ibdev is the InfiniBand device name (Switch, Bridge-0-0, Bridge-0-1, Bridge-1-0, Bridge-1-1), and port is the number of the port (1–36).
This hardware command disables a QSFP connector and port on the switch chip or a port on the BridgeX chips. The command addresses either the connector or the port on the switch chip or the BridgeX port.
The --reason option enables you to use a passphrase to lock the state of the port:
  • Blacklist – A connector and port pair are identified as being inaccessible because of unreliable operation.
  • Partition – A connector and port pair are identified as being isolated from the InfiniBand fabric.
Both the Blacklist and Partition passphrases survive reboot. You unlock these passphrases using the enableswitchport command with the --reason option.
Note:
State changes made with the ibportstate command are not recognized by the disableswitchport, enableswitchport, or listlinkup commands.
The following example shows how to disable and blacklist connector 14A with the disableswitchport command:
# disableswitchport --reason=Blacklist 14A
Disable Switch port 7 reason: Blacklist
Initial PortInfo:
# Port info: DR path slid 65535; dlid 65535; 0 port 7
LinkState:.......................Down
PhysLinkState:...................Polling
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................2.5 Gbps
After PortInfo set:
# Port info: DR path slid 65535; dlid 65535; 0 port 7
LinkState:.......................Down
PhysLinkState:...................Disabled
#
Note:
After fixing the cable connection or any connector problems, you should enable the port.

14.2.6 Enabling a Port

After fixing any connection or connector problem with a port, you should enable the port with the enableswitchport command.
On the command-line interface (CLI), run the following command:
# enableswitchport [--reason=reason] connector|ibdev port
where reason is the reason given when the port was disabled (Blacklist or Partition), connector is the number of the QSFP connector (0A–15B), ibdev is the InfiniBand device name (Switch, Bridge-0-0, Bridge-0-1, Bridge-1-0, Bridge-1-1), and port is the number of the port (1–36).
For example:
# enableswitchport --reason=Blacklist 14A
Enable Switch port 7
Initial PortInfo:
# Port info: DR path slid 65535; dlid 65535; 0 port 7
LinkState:.......................Down
PhysLinkState:...................Disabled
LinkWidthSupported:..............1X or 4X
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................4X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedEnabled:................2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkSpeedActive:.................2.5 Gbps
After PortInfo set:
# Port info: DR path slid 65535; dlid 65535; 0 port 7
LinkState:.......................Down
PhysLinkState:...................Polling
#

14.3 For More Information

For more information about Sun Network QDR InfiniBand Gateway Switches, see the product documentation.
