
NERSC 3 Greenbook


Massively Parallel Processing Supercomputers

Distributed-memory Massively Parallel Processor (MPP) systems based on commodity chips have become increasingly important to supercomputing because of their low price/performance ratios and because systems of this kind can offer extremely large memory. For suitable problems, and in particular for very memory-intensive problems, these systems are capable of much higher computational performance than PVP systems, and they can address certain Grand Challenge and capability computing problems that cannot be solved on PVP systems in any reasonable time frame.

The HPCAC T3E MPP is one of the largest configurations available anywhere. The system, installed in July of 1997, is a 512-processor T3E-900 with 1.5 Terabytes of disk and 132 GB of memory. At the time of delivery, it was the largest I/O configuration ever built. This system has a peak performance of 900 Mflops per processor, for a theoretical peak of slightly less than 0.5 Teraflops. The system has a measured performance of more than 0.25 Teraflops (265 Gflops on 512 processors). It is allocated to Grand Challenge and very large capability problems. In FY 98, the HPCAC will provide roughly 5 times as much computing resource on the MPP systems as exists on the PVP systems. Currently the MPP user base is smaller, and many applications will require restructuring to use these MPP systems efficiently.
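The peak and measured figures quoted for the T3E can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes decimal units (1 Gflops = 1000 Mflops). The numbers themselves are taken from the paragraph above.

```python
# Figures quoted in the text for the 512-processor T3E-900.
PROCESSORS = 512
PEAK_MFLOPS_PER_PROCESSOR = 900
MEASURED_GFLOPS = 265


def theoretical_peak_gflops(processors, mflops_per_processor):
    """Aggregate peak in Gflops, assuming 1 Gflops = 1000 Mflops."""
    return processors * mflops_per_processor / 1000.0


peak = theoretical_peak_gflops(PROCESSORS, PEAK_MFLOPS_PER_PROCESSOR)

# Fraction of the theoretical peak reached by the measured performance.
fraction = MEASURED_GFLOPS / peak

print(f"theoretical peak: {peak:.1f} Gflops")
print(f"measured / peak:  {fraction:.1%}")
```

Under these assumptions the theoretical peak comes to 460.8 Gflops, consistent with "slightly less than 0.5 Teraflops", and the measured 265 Gflops is a little over half of that peak.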

There also exist Shared Memory Processor systems that are not Parallel Vector, such as the IBM SP, SGI Origin, and Sun HPC series. NERSC currently has prototype systems using this architecture (a pair of Sun Enterprise 4000 systems) and in FY 98 will receive a 64-node Origin 2000 to evaluate the state of a hybrid architecture for clustering SMPs. Indeed, the J-90 complex is already an early implementation of this style of computing. It is believed that these systems will provide both fine- and coarse-grain parallelism in a single application.

Future directions for this system include the incorporation of clustered SMPs. This could range from clustering using custom switches within an MPP to clustering using commodity network components. The number of processors per SMP is also a technical challenge: whether there will be a small number (2, 4 or 8) or a large number (32 or 64) of processors sharing local memory within an SMP node. One path, small SMPs clustered with specialized connections, could be viewed as an evolution of today's MPP architecture. The other path, large SMPs clustered with commodity or custom networks, could be an evolution of today's SMP architecture. Adding the complexity of providing distributed memory access in a way that enables vector computing creates a large range of possible paths.

NERSC has begun exploring these directions in preparation for the next major acquisition in 1999 and the one after it in 2001/2002. These acquisitions could lead to a system with hundreds of processors that has Teraflops of measured performance in 1999, and tens of Teraflops in 2002.

In all these scenarios, the programming methods are different from those for the PVP or current SMP systems. To solve increasingly demanding problems that track the computational price/performance curve, HPCAC will have to work with clients and providers to develop system and programming environment software that allows applications to capture a significant part of the aggregate computational power these new systems provide. New models of computing services are also needed to fully realize the potential of the new technology.


Rick A Kendall
7/13/1998