National Energy Research Scientific Computing Center
  A DOE Office of Science User Facility
  at Lawrence Berkeley National Laboratory
 

Topspin (IP over InfiniBand)


Results taken: 11/2003 - 12/2003

    Test Configuration

      SWITCH/PORTS SETUP:

        Dell 5224 (gusgs06, Ports 1-4) <=(4)=> Extreme 7i
        VLAN 310: Ports 5-12,21-24
        VLAN 311: Ports 13-16
        VLAN 312: Ports 17-20

        Topspin (gusic03, Port 2/1) <-> Dell Port 17
        Topspin (gusic03, Port 2/2) <-> Dell Port 21

        guscn25 (eth2) <--> Dell (Port 5)
        guscn26 (eth2) <--> Dell (Port 6)

      Software Version:

        Topspin TS90 OS Version: 1.1.3-build703

      TEST NODES: guscn25, guscn26

        Both nodes are configured with:
        • Supermicro P4DP6 motherboards with six PCI-X slots, two of which are 133 MHz capable
        • dual 2.2 GHz Pentium IV Prestonia Xeon CPUs
        • 2 GB of DDR PC2100 ECC memory
        • dual on-board Intel PRO/100 Ethernet interfaces
        • one Intel PRO/1000 XT 133 MHz PCI-X Gigabit Ethernet NIC
        • Topspin 4x HCA with IPoIB (IP over InfiniBand) driver
        • Redhat 7.3 Linux 2.4.20-20.7smp kernel

      TEST SOFTWARE:

        iperf version 1.7.0 (13 Mar 2003) pthreads (default settings, except TCP_WINDOW_SIZE)
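
The exact command lines are not recorded in this report. As a minimal sketch, the runs could be driven along these lines, assuming that T1/T2/T4 correspond to iperf's -P (parallel client threads) option and that the TCP_WINDOW_SIZE setting is equivalent to the -w option. The helper name iperf_client_cmd is hypothetical, and the commands are only built here, not executed.

```python
import shlex


def iperf_client_cmd(server, threads=1, window=None):
    """Build an iperf 1.x client command line as an argument list.

    Assumptions (not stated in the report): T<n> maps to '-P n', and the
    window size is passed with '-w'. The server side would run 'iperf -s'
    with the same '-w' value.
    """
    cmd = ["iperf", "-c", server, "-P", str(threads)]
    if window is not None:
        cmd += ["-w", window]
    return cmd


# One command per thread count used in the tables, default window.
for threads in (1, 2, 4):
    print(shlex.join(iperf_client_cmd("guscn26-ge0", threads)))

# Single thread with the larger 256 KB window.
print(shlex.join(iperf_client_cmd("guscn26-ge0", 1, "256K")))
```

Building the argument list separately from running it keeps the sketch safe to execute on a machine without iperf installed.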

    Test Results

      10/100 <-> 10/100 (64 subnet, routed within a 10/100 switch):
            TCP_WINDOW_SIZE=85.3 KB (default)
        
            guscn25 -> guscn26   T1: 94.1 Mb/s
            guscn26 -> guscn25   T1: 94.1 Mb/s
                                 T2: 94.1 Mb/s
                                 T4: 94.1 Mb/s
        
            Second Try:
            guscn25 -> guscn26   T1: 94.1 Mb/s, 94.1 Mb/s
                                 T2: 94.1 Mb/s, 94.1 Mb/s
                                 T4: 94.1 Mb/s, 94.1 Mb/s
            guscn26 -> guscn25   T1: 94.1 Mb/s, 94.1 Mb/s
                                 T2: 94.1 Mb/s, 94.1 Mb/s
                                 T4: 94.1 Mb/s, 94.1 Mb/s
        
            TCP_WINDOW_SIZE=256 KB
        
            guscn25 -> guscn26   T1: 94.1 Mb/s, 94.1 Mb/s
                                 T2: 94.1 Mb/s, 94.1 Mb/s
                                 T4: 94.1 Mb/s, 94.1 Mb/s
        
        
            Summary: 10/100 performance is 94 Mb/s, independent of how
            many iperf processes/threads (T1, T2, T4) were used.
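
The 94.1 Mb/s figure is what framing overhead alone predicts for Fast Ethernet. The sketch below works through that arithmetic, assuming a 1500-byte MTU and TCP timestamps enabled (12 bytes of TCP options per segment); neither setting is stated in this report, and the function name tcp_goodput_mbps is hypothetical.

```python
ETH_OVERHEAD = 38          # preamble+SFD 8, MAC header 14, FCS 4, inter-frame gap 12
IP_TCP_HEADERS = 40        # 20-byte IPv4 header + 20-byte TCP header
TCP_TIMESTAMP_OPTION = 12  # assumed: timestamps enabled on both hosts


def tcp_goodput_mbps(line_rate_mbps, mtu=1500, timestamps=True):
    """Upper bound on TCP payload throughput for a given Ethernet line rate."""
    options = TCP_TIMESTAMP_OPTION if timestamps else 0
    payload = mtu - IP_TCP_HEADERS - options  # TCP payload bytes per frame
    wire_bytes = mtu + ETH_OVERHEAD           # bytes of wire time per frame
    return line_rate_mbps * payload / wire_bytes


print(f"{tcp_goodput_mbps(100):.1f} Mb/s")  # prints "94.1 Mb/s"
```

Under these assumptions the link is already saturated at T1, which is why extra threads and a larger window change nothing. The same formula gives about 941 Mb/s for Gigabit Ethernet at a 1500-byte MTU and about 990 Mb/s at a 9000-byte MTU, so the ~990 Mb/s GigE plateau reported below would be consistent with jumbo frames, although the MTU in use is not recorded here.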
        
      GigE <-> GigE
        Same 68 subnet (routed within a Dell switch):
              TCP_WINDOW_SIZE=85.3 KB (default)
          
              guscn25-ge0 -> guscn26-ge0   T1: 633 Mb/s, 632 Mb/s
                                           T2: 989 Mb/s, 989 Mb/s
                                           T4: 990 Mb/s, 990 Mb/s
              guscn26-ge0 -> guscn25-ge0   T1: 630 Mb/s, 630 Mb/s
                                           T2: 989 Mb/s, 989 Mb/s
                                           T4: 990 Mb/s, 990 Mb/s
          
              TCP_WINDOW_SIZE=256 KB
          
              guscn25-ge0 -> guscn26-ge0   T1: 990 Mb/s, 990 Mb/s
                                           T2: 990 Mb/s, 990 Mb/s
                                           T4: 990 Mb/s, 990 Mb/s
          
          
          
        Between 52 and 68 subnets (Dell <-> E7i <-> Dell):
              guscn25-ge0 -> guscn26-ge0   T1: 360 Mb/s, 362 Mb/s
                                           T2: 671 Mb/s, 672 Mb/s
                                           T4: 989 Mb/s, 989 Mb/s
              guscn26-ge0 -> guscn25-ge0   T1: 361 Mb/s, 360 Mb/s
                                           T2: 689 Mb/s, 689 Mb/s
                                           T4: 989 Mb/s, 989 Mb/s
          
             Summary: iperf reached ~990 Mb/s over GigE, but with the
             default window it needed more than one iperf process/thread
             to do so. With a single iperf thread (T1), throughput was lower:
             about 633 Mb/s when the traffic stayed within the same Dell
             switch, and about 360 Mb/s when it had to hop through the
             Extreme 7i switch.
          
             With enough iperf threads (T2 within the Dell switch, T4
             through the Extreme 7i), the GigE pipe was saturated.
          
             With TCP_WINDOW_SIZE increased to 256 KB, a single iperf thread
             reached 990 Mb/s (measured within the same Dell switch).
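
The window-size effect can be sketched with the usual bound, throughput <= window / round-trip time. The round-trip times were not measured in these tests, so the values below are only what the T1 results would imply if a single stream were purely window-limited; the actual in-flight window may also differ from the reported buffer size. Both function names are hypothetical.

```python
def implied_rtt_ms(window_kb, throughput_mbps):
    """Effective round-trip time implied by a window-limited stream."""
    window_bits = window_kb * 1024 * 8
    return window_bits / (throughput_mbps * 1e6) * 1e3


def window_for_rate_kb(rate_mbps, rtt_ms):
    """Window (KB) needed to sustain rate_mbps at the given round-trip time."""
    return rate_mbps * 1e6 * (rtt_ms / 1e3) / 8 / 1024


# T1 results with the default 85.3 KB window, taken from the tables above.
for path, mbps in (("same Dell switch", 633), ("via Extreme 7i", 360)):
    rtt = implied_rtt_ms(85.3, mbps)
    need = window_for_rate_kb(990, rtt)
    print(f"{path}: implied RTT {rtt:.2f} ms, window for 990 Mb/s {need:.0f} KB")
```

Under this assumption the implied effective round trip is roughly 1.1 ms within the Dell switch and 1.9 ms through the Extreme 7i, and a 256 KB window is larger than either path would need for 990 Mb/s. That agrees with the single-thread 256 KB result measured within the Dell switch; the routed path was not retested with the larger window.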
          
      IPoIB <-> IPoIB (52 subnet, routed within the Topspin switch):
         
           TCP_WINDOW_SIZE=85.3 KB (default)
        
             guscn25-ib0 -> guscn26-ib0  T1: 967 Mb/s, 989 Mb/s
                                         T2: 965 Mb/s, 966 Mb/s
                                         T4: 918 Mb/s, 917 Mb/s
             guscn26-ib0 -> guscn25-ib0  T1: 918 Mb/s, 888 Mb/s
                                         T2: 991 Mb/s, 988 Mb/s
                                         T4: 939 Mb/s, 938 Mb/s
                                         T8: 923 Mb/s, 920 Mb/s
        
           Summary: The IPoIB performance was disappointing. It ranged from
           about 890 to 990 Mb/s, no better than GigE, and with four or more
           threads it fell below the ~990 Mb/s that GigE sustained.
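
To put the IPoIB numbers in context, the sketch below averages the results per thread count and compares them with the nominal data rate of a 4x link. It assumes a single-data-rate 4x link (4 lanes at 2.5 Gb/s signaling with 8b/10b encoding, so 8 Gb/s of data), which is not stated in this report; the function name summarize is hypothetical.

```python
from statistics import mean

# IPoIB results in Mb/s from the table above, both directions pooled.
IPOIB_MBPS = {
    1: [967, 989, 918, 888],
    2: [965, 966, 991, 988],
    4: [918, 917, 939, 938],
    8: [923, 920],
}

# Assumption: 4 lanes x 2500 Mb/s signaling, 8b/10b encoding.
IB_4X_DATA_RATE_MBPS = 4 * 2500 * 8 // 10


def summarize(results, link_mbps):
    """Return {threads: (mean Mb/s, fraction of link data rate)}."""
    return {t: (mean(v), mean(v) / link_mbps) for t, v in results.items()}


for threads, (avg, frac) in summarize(IPOIB_MBPS, IB_4X_DATA_RATE_MBPS).items():
    print(f"T{threads}: {avg:.1f} Mb/s, {frac:.1%} of link data rate")
```

Under that assumption every configuration stays below one eighth of the link's data rate, which suggests the limit lay in the IPoIB path on the hosts rather than in the InfiniBand fabric itself.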
        
      IPoIB <-> GigE
        Between 52 and 68 subnets (Topspin bridging <-> Dell <-> E7i <-> Dell):
             TCP_WINDOW_SIZE=85.3 KB (default)
          
               guscn25-ib0 -> guscn26-ge0  T1: 495 Mb/s, 495 Mb/s
                                           T2: 755 Mb/s, 754 Mb/s
                                           T4: 758 Mb/s, 759 Mb/s
               guscn26-ge0 -> guscn25-ib0  T1: 797 Mb/s, 796 Mb/s
                                           T2: 784 Mb/s, 790 Mb/s
                                           T4: 785 Mb/s, 771 Mb/s
          
             Summary: This test measured bridging performance between
             IB and GigE, with IPoIB and GigE on different subnets. The
             Extreme 7i routed the traffic between the two subnets.
             The performance was disappointing, peaking at around 800 Mb/s.  Also,
             the numbers showed faster performance from GigE to IB than
             from IB to GigE.  The reason is not clear.
          
        Same 68 subnet (Topspin bridging <-> Dell):
             TCP_WINDOW_SIZE=85.3 KB (default)
          
               guscn25-ib0 -> guscn26-ge0  T1: 561 Mb/s, 559 Mb/s
                                           T2: 766 Mb/s, 767 Mb/s
                                           T4: 767 Mb/s, 766 Mb/s
               guscn26-ge0 -> guscn25-ib0  T1: 794 Mb/s, 794 Mb/s
                                           T2: 780 Mb/s, 781 Mb/s
                                           T4: 784 Mb/s, 783 Mb/s
          
             Summary: This test measured bridging performance between
             IB and GigE, with IPoIB and GigE on the same subnet (without
             hopping through the Extreme 7i switch). The performance was also
             disappointing, at no more than about 795 Mb/s.  Again, it was
             faster from GigE to IB than from IB to GigE.
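
The direction asymmetry in the two bridging tables can be quantified per thread count. The sketch below uses only the numbers reported above; the dictionary layout and the function name direction_ratio are illustrative choices, not part of the original test harness.

```python
from statistics import mean

# Bridging results in Mb/s, keyed by path, direction, and thread count.
BRIDGING_MBPS = {
    "routed via Extreme 7i": {
        "ib_to_ge": {1: [495, 495], 2: [755, 754], 4: [758, 759]},
        "ge_to_ib": {1: [797, 796], 2: [784, 790], 4: [785, 771]},
    },
    "same subnet": {
        "ib_to_ge": {1: [561, 559], 2: [766, 767], 4: [767, 766]},
        "ge_to_ib": {1: [794, 794], 2: [780, 781], 4: [784, 783]},
    },
}


def direction_ratio(case):
    """Ratio of GigE->IB to IB->GigE mean throughput, per thread count."""
    return {
        t: mean(case["ge_to_ib"][t]) / mean(case["ib_to_ge"][t])
        for t in case["ib_to_ge"]
    }


for path, case in BRIDGING_MBPS.items():
    ratios = ", ".join(f"T{t}: {r:.2f}x" for t, r in direction_ratio(case).items())
    print(f"{path}: {ratios}")
```

Seen this way, the gap is large only for a single thread (about 1.6x routed and 1.4x on the same subnet) and shrinks to a few percent at T2 and T4, where both directions level off near 760 to 790 Mb/s.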