The GUPFS project uses a testbed system to investigate and evaluate the component technologies needed for a center-wide shared file system, and to explore how these components interact. We have also used the testbed to develop the GUPFS benchmark methodology and the benchmark codes used in the technology evaluations. The testbed continues to be a useful resource for attracting the attention of component technology vendors and for developing relationships with a number of them.

The testbed we used during FY 2003 was the expanded testbed upgraded at the end of FY 2002. This upgrade is detailed in the GUPFS Project FY 2002 report. The testbed was designed and built to provide sufficient hardware and computational resources to support the evaluation of multiple new component technologies, and to provide the underlying SAN fabric and storage resources with sufficient aggregate performance to stress-test existing and emerging shared file system technologies. The design emphasized extensibility in order to accommodate future technology developments. In this regard, the testbed proved very effective: a variety of shared file systems were successfully tested, a number of fabric components were integrated and tested throughout the year, and additional storage solutions were brought in and evaluated.

The base configuration of the GUPFS testbed during FY 2003 and the changes to that configuration throughout the year are presented in the following sections.

1. FY 2003 Initial Testbed Configuration

The GUPFS FY 2003 testbed system presented a microcosm of a parallel scientific cluster: dedicated computational nodes, special-function service nodes, and a high-speed interconnect for message passing. It used an internal jumbo-frame Gigabit Ethernet as the primary high-speed message passing interconnect.
An internal 10/100 Mb/s Fast Ethernet LAN was employed for system management and NFS distribution of the user home file systems. The testbed supplied Fibre Channel as the base SAN fabric, as well as Fibre Channel storage, and a variety of alternative fabrics and bridges between these fabrics. During FY 2003, the testbed was configured as a Linux
parallel scientific cluster, with a management node, a core set of 16 dedicated
dual Pentium-4 compute nodes, a set of six special-purpose dual Pentium-4
nodes, and a reserve of five auxiliary dual Pentium-3 compute nodes from the
original FY 2002 testbed. The Fibre Channel SAN fabric was expanded extensively
with the addition of two 16-port 2 Gb/s FC switches. The Gigabit Ethernet used
as the message passing interconnect for parallel jobs was also expanded to
support the increased number of nodes and iSCSI testing. A picture of the FY
2003 testbed appears as Figure 1. The FY 2003 testbed
configuration is shown in Figure 2.
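The node counts in the paragraph above can be cross-checked with a few lines of Python. This is only an illustrative sketch: the `NodeGroup` type and the `count_by_cpu` helper are hypothetical names introduced here, and the management node is counted as a Pentium-3 system, as this section states further on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeGroup:
    """One class of testbed node (hypothetical type, for illustration only)."""
    role: str
    cpu: str
    count: int


# Counts taken from the description of the FY 2003 testbed.
GROUPS = [
    NodeGroup("management", "Pentium-3", 1),
    NodeGroup("dedicated compute", "Pentium-4", 16),
    NodeGroup("special-purpose", "Pentium-4", 6),
    NodeGroup("auxiliary compute", "Pentium-3", 5),
]


def count_by_cpu(groups):
    """Return the total number of nodes per CPU generation."""
    totals = {}
    for group in groups:
        totals[group.cpu] = totals.get(group.cpu, 0) + group.count
    return totals


if __name__ == "__main__":
    per_cpu = count_by_cpu(GROUPS)
    print(per_cpu)
    print("total nodes:", sum(per_cpu.values()))
```

Under these counts the tally gives twenty-two Pentium-4 nodes and six Pentium-3 nodes, which matches the component list for the testbed.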
Figure 1. The FY 2003 testbed, with the NetStorager shown in front.

The following major components were included in the FY 2003 testbed:

System nodes
· Twenty-two dual Pentium-4 nodes: sixteen in 2U cases and six in 4U cases (these are described in greater detail later in this section)
· Six dual Pentium-3 nodes in 4U cases

Fabric
· Ethernet
  o One 32-port Extreme 7i Gigabit Ethernet switch
  o One 16-port Extreme 5i Gigabit Ethernet switch
  o Two 10/100 Ethernet switches for system management
· Fibre Channel
  o Two 16-port 2 Gb/s Fibre Channel switches (Brocade SilkWorm 3800 and Qlogic SANbox2-16)
  o One 16-port 1 Gb/s Fibre Channel switch (Brocade SilkWorm 2800)
  o One Cisco SN5428 iSCSI Router fabric bridge to Ethernet
· Myrinet
  o One Myrinet 2000 8-port switch with eight host interface cards
· InfiniBand
  o One InfiniCon ISIS InfinIO 7000 1x InfiniBand switch, with eight 1x HCA host adapters, and fabric bridge modules for Fibre Channel and Gigabit Ethernet

Storage
· An EMC CLARiiON CX600 disk subsystem
· A Dot Hill 7124 RAID disk subsystem
· A Silicon Gear Mercury II RAID subsystem
· A Chaparral A8526 RAID subsystem with attached storage
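The capacity of the four storage subsystems listed above can be checked with simple arithmetic, using the drive counts and the 73 GB drive size that this section gives later for each subsystem. The sketch below is illustrative only: it assumes decimal units (1 TB = 1000 GB), and `unformatted_tb` and `formatted_fraction` are hypothetical helper names. It does not try to reproduce the RAID 5 layout, which the text does not specify.

```python
DRIVE_GB = 73  # nominal drive size stated in the text

# Drive counts per subsystem, as stated in the text.
DRIVES = {
    "DotHill 7124": 20,
    "Silicon Gear Mercury II": 12,
    "Chaparral A8526": 10,
    "EMC CLARiiON CX600": 30,
}


def unformatted_tb(drives, drive_gb=DRIVE_GB):
    """Total raw capacity in decimal terabytes (assumes 1 TB = 1000 GB)."""
    return sum(drives.values()) * drive_gb / 1000


def formatted_fraction(formatted_tb, raw_tb):
    """Share of the raw capacity that remains after formatting."""
    return formatted_tb / raw_tb


if __name__ == "__main__":
    raw = unformatted_tb(DRIVES)
    print(f"drives: {sum(DRIVES.values())}, raw capacity: {raw:.3f} TB")
    # 4.2 TB is the formatted RAID 5 figure quoted in the text.
    print(f"formatted share: {formatted_fraction(4.2, raw):.2f}")
```

With 72 drives the raw total comes to about 5.3 TB, in line with the figure quoted in the text, and the quoted 4.2 TB of formatted RAID 5 storage corresponds to roughly 80% of that.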
Figure 2. FY 2003 base GUPFS testbed configuration.

The new Pentium-4 nodes all used the same motherboard and were configured similarly. The only difference among them was the size of the case in which they were installed. Sixteen of the new technology nodes were put in 2U cases in order to save space, eliminating the need to buy more than one additional cabinet. Six of the new technology nodes were put in 4U cases to allow standard-height peripheral component interconnect (PCI) cards to be installed for the Myrinet 2000 host interfaces, Intel PRO/1000 T IP iSCSI cards, and early 1x InfiniBand HCAs for the InfiniCon InfinIO 7000 fabric bridge. The need for PCI-X slots on the motherboard to support the high-performance FC, InfiniBand, and Gigabit Ethernet cards dictated the class of motherboards and processors that were acquired, as the only motherboards available with PCI-X buses were relatively high-end server motherboards.

· All Pentium-4 nodes, regardless of the size of their case, had the same base configuration. This configuration consisted of the same motherboards, dual 2.2 GHz Pentium IV Prestonia Xeon CPUs, 2 GB of DDR memory, 10/100 and Gigabit Ethernet interfaces, 36 GB SCSI disks, and Qlogic 2340 2 Gb/s Fibre Channel HBAs.
· All Pentium-3 nodes had the same base configuration. This consisted of identical motherboards, dual Pentium III 1 GHz CPUs, 1 GB of memory, 10/100 and Gigabit Ethernet interfaces, and 18 GB SCSI disks. One of the Pentium-3 nodes was configured as a management node and had additional 10/100 and Gigabit Ethernet interfaces. The remaining five Pentium-3 nodes were configured as auxiliary compute nodes with Qlogic 2310 2 Gb/s Fibre Channel HBAs.

All Pentium-4 nodes, regardless of the size of their case, were configured with:
· Supermicro P4DP6 motherboards with six PCI-X slots, two of which are 133 MHz capable
· Dual 2.2 GHz Pentium IV Prestonia Xeon CPUs
· 2 GB of DDR PC2100 ECC memory
· Dual onboard Intel PRO/100 Ethernet interfaces
· Dual onboard U160 Adaptec SCSI controllers
· Onboard VGA graphics
· One 36 GB Ultra 160 LVD 10K RPM SCSI disk drive
· One Qlogic qla2340 133 MHz PCI-X Fibre Channel HBA (low or standard profile)
· One Intel PRO/1000 XT 133 MHz PCI-X Gigabit Ethernet NIC (low or standard profile)

All Pentium-3 nodes shared a common base configuration. The management/interactive node and the computational nodes differed only in that the computational nodes contained a 2 Gb/s Fibre Channel interface card, while the management node contained additional Fast Ethernet cards and an additional Gigabit Ethernet card. All of the Pentium-3 nodes were installed in 4U rack-mount cases and had the following base configuration:
· Intel Server Board STL2 motherboards, with two 64-bit 66 MHz PCI slots and additional 32-bit PCI slots
· Dual Pentium III 1 GHz CPUs
· 1 GB of PC133 ECC memory
· Onboard VGA (video graphics array) graphics
· Onboard U160 Adaptec SCSI controllers
· One onboard Intel PRO/100 Ethernet interface
· One 18 GB Ultra 160 LVD 10K RPM SCSI disk drive
· One Qlogic qla2200 64-bit Fibre Channel Optical HBA (compute nodes only)
· One Intel PRO/1000 T 64-bit PCI Gigabit Ethernet NIC (the management node had two)

The identical base configuration of all the Pentium-4 nodes, including those intended as special-purpose nodes, allowed them to be used at times as compute nodes for scalability testing. Four of the new nodes were configured to be special-purpose nodes. These special-purpose nodes had the same basic hardware and software configuration as the dedicated compute nodes. The four special-purpose nodes were configured to perform the following functions: ·
Code development and benchmark debugging
· Metadata and lock manager services
· A dedicated installation target for developing and testing new kickstart configurations
· A storage server for testing distribution of shared file systems with NFS gateways

Two additional Pentium-4 nodes were initially reserved for transient special-purpose usage, such as running the InfiniCon InfiniBand subnet manager, which initially ran under Windows 2000. When InfiniCon’s subnet manager became able to run under Linux, both of these nodes were reconfigured as general-purpose compute nodes and were usable in evaluations.

The increased scale, many advanced technology components, and flexible and expandable design of the updated GUPFS testbed will enable many interesting and important evaluations to be conducted over the next several years. These evaluations should lead to the selection of the best and most appropriate component technologies for the rollout of a high-performance shared file system during the second phase (FY 2005–2006) of the GUPFS project.

All testbed nodes ran Linux, based on the Red Hat 7.1, 7.2, 7.3, 8.0, or 9.0 distribution, depending on the requirements of the file system tested. The testbed supported parallel job submission and execution using OpenPBS and used MPICH as the MPI implementation for parallel jobs. Portland Group C, C++, and various flavors of FORTRAN compilers provided the compilation and execution environment for the parallel jobs.

Independent Linux systems on individual nodes were installed automatically through PXE boot kickstart mechanisms. This allowed multiple, completely different system images to be present on each of the nodes, enabling rapid reconfiguration of the testbed so that it could quickly switch among different software environments, each of which was needed to conduct a different evaluation.

All nodes except the management node were connected to a switched 2 Gb/s Fibre Channel SAN fabric by 2 Gb/s Qlogic 2300 family FC Host Bus Adapters (HBAs). The 2 Gb/s FC fabric was entirely optical. The 1 Gb/s FC copper fabric was retired and replaced with an optical fabric, except as necessary to attach the original 1 Gb/s FC storage to the 1 Gb/s Brocade SilkWorm 2800 switch.

In addition to the testbed nodes, the various storage devices, both permanent and under evaluation, were attached to the same switched FC fabric, as were a number of fabric bridges. These fabric bridges included the Cisco SN5428 Storage Router, which bridged Gigabit Ethernet iSCSI storage traffic from hosts to FC fabric attached storage devices, and the InfiniCon InfinIO fabric bridge between InfiniBand attached hosts and FC fabric attached storage. The disk storage devices connected to the Fibre Channel SAN fabric were: ·
A dual-controller DotHill 7124 RAID subsystem, with an expansion cabinet
· A dual-controller Silicon Gear Mercury II RAID subsystem
· A single-controller Chaparral A8526 RAID subsystem with attached storage
· A dual-controller EMC CLARiiON CX600 RAID subsystem with storage

Each of the RAID controllers had two or more Fibre Channel ports for connecting to the switch, and these FC ports could be used simultaneously. All four storage devices supported various RAID configurations and used similar 10,000 RPM 73 GB disk drives. The DotHill contained 20 drives, the Silicon Gear 12 drives, the Chaparral 10 drives, and the EMC 30 drives. Total unformatted SAN attached storage capacity was approximately 5.3 terabytes (TB), with a nominal maximum of 4.2 TB of formatted RAID 5 storage. The DotHill and Silicon Gear were both limited to 1 Gb/s FC interfaces, while the Chaparral and EMC supported 2 Gb/s FC interfaces.

The storage configuration was chosen to enable the exploration of the relative performance, reliability, and interoperability of multiple storage vendors’ products. The quantity and character of the storage were dictated by: ·
The technology available at the time of acquisition
· The desire to be able to achieve maximum performance from each storage controller
· The desire to be able to explore the Linux support for file systems greater than 2 TB on 32-bit architectures when such support became available
· Price

1.2 Testbed Configuration Changes during FY 2003

A number of changes in the testbed configuration occurred during FY 2003. These included upgrading the InfiniCon InfinIO switch and HCAs from 1x (2.5 Gb/s) to 4x (10 Gb/s) InfiniBand, exchanging the Myrinet 2000 PCI-based Rev C host interface adapters for higher-performance PCI-X Rev D cards, and connecting the Alvarez cluster 10/100 management network with the GUPFS testbed Gigabit Ethernet fabric. Another modification was the exchange of three of the five Intel iSCSI HBAs for a newer version that could run with more up-to-date Linux kernels, allowing the iSCSI HBAs to be tested in conjunction with the newer file systems.

The Myrinet 2000 host adapter exchange allowed all eight Myrinet 2000 host adapters to be installed and the full capabilities of the testbed Myrinet fabric to be investigated. The new Rev D adapters were available in a low-profile form factor (2U), which allowed all of them to be installed in nodes. The previous adapters were full height (4U). Since the testbed had only six 4U Pentium-4 nodes, only six of the eight original adapters could be installed.

Once all eight Rev D adapters were installed, the GUPFS project proceeded with plans to install the GUPFS 8-port Myrinet switch blade with the Alvarez Myrinet switch, making the eight connected GUPFS nodes part of the Alvarez system. This allowed testing of GPFS 1.3 for Linux using Alvarez compute nodes, with GUPFS nodes serving as high-performance Network Shared Disk (NSD) servers in lieu of Alvarez’s low-performance storage. It also allowed GPFS testing at a large scale (64 or more nodes) in conjunction with 1600 MB/s of storage bandwidth, a combination unachievable by either system alone. In addition, it gave us an opportunity to begin investigating shared file systems running on multiple systems.

During FY 2003, the GUPFS testbed InfiniBand configuration received several updates. InfiniCon upgraded the HCAs and switch modules to 4x (10 Gb/s), enabling early investigation of storage transfers over 4x InfiniBand. As in the case of the 1x IB HCAs, the initial 4x HCAs were full height (4U). Because the 4x HCAs required PCI-X slots, and because the testbed had only six 4U Pentium-4 nodes with PCI-X slots, the InfiniBand fabric deployment was limited to six systems, although components were available for eight systems.

In one of a number of software upgrades, it became possible to run the InfiniCon IB subnet manager under Linux. This allowed the system running the subnet manager under Windows 2000 to be converted back to Linux and used as a compute node in evaluations.

A second InfiniCon hardware upgrade brought in a second generation of 4x HCAs. In addition to providing higher performance, these HCAs were available in a low-profile (2U) form factor. This made it possible for us to install all eight of the new HCAs in 2U Pentium-4 nodes and to fully populate the configuration for the first time. This allowed more meaningful scalability numbers to be obtained, enabling more direct and clear interconnect/fabric performance comparisons.
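The form-factor limits described above, and the 1x and 4x link rates, can be expressed as a short calculation. The following is a minimal sketch under stated assumptions: `link_rate_gbps` and `installable` are hypothetical helper names, each node is assumed to take one adapter of a given type, and the case counts are the sixteen 2U and six 4U Pentium-4 nodes described earlier.

```python
IB_LANE_GBPS = 2.5  # the text equates 1x InfiniBand with 2.5 Gb/s

# Pentium-4 nodes by case size, as described for the FY 2003 testbed.
P4_NODES_BY_CASE = {"2U": 16, "4U": 6}


def link_rate_gbps(lanes):
    """Signalling rate of an InfiniBand link with the given lane count."""
    return lanes * IB_LANE_GBPS


def installable(adapters, case_sizes):
    """Adapters that can be placed, assuming one adapter per eligible node.

    case_sizes lists the case sizes that the card's form factor fits.
    """
    eligible = sum(P4_NODES_BY_CASE[size] for size in case_sizes)
    return min(adapters, eligible)


if __name__ == "__main__":
    print("1x link:", link_rate_gbps(1), "Gb/s")
    print("4x link:", link_rate_gbps(4), "Gb/s")
    # Full-height cards fit only the 4U cases.
    print("full-height adapters placed:", installable(8, ["4U"]))
    # Low-profile cards were installed in 2U cases, per the text.
    print("low-profile adapters placed:", installable(8, ["2U"]))
```

With eight adapters on hand, the full-height cards are capped at six by the number of 4U nodes, while the low-profile cards can all be placed, which is the behaviour the text reports for both the Myrinet and the InfiniBand upgrades.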
Page last modified: Tue, 22 Jun 2004 22:50:18 GMT
Page URL: http://www.nersc.gov/projects/GUPFS/testbed/GUPFS_testbed03.php
Web contact: webmaster@nersc.gov
Computing questions: consult@nersc.gov