
NERSC 3 Greenbook

Background and Introduction

The Energy Research Supercomputer Users Group (ERSUG) encompasses all investigators utilizing the High Performance Computing Access Center (HPCAC) resources of the Department of Energy Office of Energy Research. At the June 1995 meeting held at the National Energy Research Scientific Computing (NERSC) facility, the ERSUG executive committee (ExERSUG) determined the need to reassess the role of computational science in Energy Research (ER) programs. The continuing rapid development of the computational science fields and of computer technology (both hardware and software), combined with the fiscal realities of the 1990s, demands frequent review of user requirements and of the role that computational science should play in meeting ER program commitments.

It is clear from the current environment at NERSC and at the National Science Foundation (NSF) funded supercomputer centers that the demand for computational resources greatly exceeds the supply. The CRAY-C90 was a saturated resource only a few days after it was made available to the ER community at NERSC. The advent of Massively Parallel Processing (MPP) supercomputers and Symmetric Multiprocessing (SMP) systems at supercomputer sites has increased both the capabilities and the capacity of supercomputer centers but has only marginally met the aggregate demand for computational resources.

When the last user requirements update was written in 1991, parallel supercomputers, in particular MPPs, had just started to become viable alternatives to vector supercomputers for certain classes of problems. The efforts of the computational science and computer science communities have positioned MPPs as full partners with traditional vector parallel architectures, as evidenced by offerings from vendors such as SGI/CRI and IBM. The explosive growth of workstation processor technology, coupled with specialized communication designs, has been the mainstay of the hardware technologies in these MPP systems. MPP architectures based on workstation processor technology coupled with standard bus-based shared memory systems (i.e., the SMP workstation) have been approaching the throughput performance of traditional vector supercomputers from the other end of the parallel computing spectrum. The evolution of MPP supercomputers by the vendor community benefits from both the fully distributed technology and the cluster-of-SMPs approach. Currently the most popular MPP supercomputers are of the fully distributed class: the SGI/CRI T3E and the IBM SP. The clustered SMP approach is represented by the SGI Power Challenge Array, the Origin-2000, the SUN HPC-10000, and the Convex Exemplar.

The difficulty in using MPP systems still lies in adapting the relevant application software to exploit the available distributed resources (CPU, memory bandwidth and capacity, I/O) efficiently. The key issue in developing efficient MPP applications is access to non-uniform memory: cache, local memory, and remote memory. This issue includes both the speed of access and the methods of access to the memory architecture. Both the fully distributed and the clustered SMP MPPs require labor-intensive re-engineering, or redesign and re-implementation, of algorithms in most fields of computational science. This labor-intensive trend is very much like the transition that occurred in the computational science fields when vector supercomputers were first introduced in the late 1970s. The relatively small local memory (e.g., 256 MB) available on the fully distributed MPPs poses special problems for many applications; these problems are partially alleviated in the cluster-of-SMPs architecture.

There are at least four underlying components to the computational resources used by ER researchers.

In addition to the availability from the vendor community, these components determine the implementation and direction of the development of the supercomputing resources for the ER community.

The ER community has an interest in all of these components. ER funded programs are vitally interested in and contribute to new high-performance computing technology. The ER community defines and does research in "Grand Challenge" problems that require the development and integration of both new production computing resources and advanced software technologies and algorithms. The Energy Sciences Network is the electronic nervous system that connects ER scientists to the world-wide scientific community and to distributed computational resources that allow ER scientists to complete their programmatic missions.

In this document we will classify the user community of the ER HPCAC resources, focusing on the NERSC facility (section 2), delineate a subset of the important scientific areas that require these resources (section 3), and describe the mix of centralized and local resources (both current and future technology) that is necessary to the vitality of the ER funded research programs (section 4). We will not attempt to measure the actual ER programmatic needs in terms of hours of computer time, since they could never be completely met, but we do intend to make recommendations (section 5) that will facilitate more effective utilization of current and future resources under the fixed or declining budget scenarios of the 1990s.


Rick A Kendall
7/13/1998