Terascale Optimal PDE Simulations

The Terascale Optimal PDE Simulations (TOPS) ISIC is researching, developing, and deploying a toolkit of open-source solvers for the nonlinear partial differential equations (PDEs) that arise in many application areas, including fusion, accelerator design, global climate change, and the collapse of supernovae. These algorithms aim to reduce computational bottlenecks by one or more orders of magnitude on terascale computers, enabling scientific simulation on a scale that was previously impossible.

Figure 3. Memory bandwidth benchmark results. Asterisks show the model bandwidth computed for each of the last eight bars (non-standard STREAM kernels).

One of the major TOPS activities in 2002 involved the magnetohydrodynamics (MHD) code M3D. A Hypre algebraic multigrid solver was ported into M3D underneath the existing PETSc interface, and scalability studies were carried out on M3D production runs. PETSc itself, a suite of data structures and routines for solving PDEs, is undergoing performance tuning and testing on terascale applications.

The Berkeley Benchmarking and Optimization Group (BeBOP) achieved over 80% of the modeled peak Mflop/s with performance-tuned sparse matrix-vector products and sparse triangular solves on the IBM SP Power 3 processor nodes at NERSC (Figure 3). The model estimates the best performance attainable given a computer’s memory system. The achieved rate corresponds to 15–20% of the processor’s peak performance, a good gain over previous codes.
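The idea of a memory-bound performance ceiling can be illustrated with a minimal sketch. This is not BeBOP's model, which is more detailed. It is a simplified bound that assumes a compressed sparse row (CSR) matrix with 8-byte values and 4-byte indices, and assumes that every matrix entry and vector element crosses the memory bus exactly once. The function names, byte sizes, and the 1000 MB/s bandwidth figure are hypothetical and chosen only for illustration.

```python
def csr_matvec(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def bandwidth_bound_mflops(n_rows, n_cols, nnz, bandwidth_mb_per_s,
                           value_bytes=8, index_bytes=4):
    """Upper bound on Mflop/s if memory traffic is the only cost.

    Assumption (illustrative only): each value, column index, row pointer,
    source-vector element, and destination-vector element moves once.
    """
    flops = 2 * nnz  # one multiply and one add per stored nonzero
    bytes_moved = (nnz * (value_bytes + index_bytes)   # values + column indices
                   + (n_rows + 1) * index_bytes        # row pointers
                   + n_cols * value_bytes              # read x
                   + n_rows * value_bytes)             # write y
    # (flop / byte) * (MB / s) gives Mflop/s.
    return flops / bytes_moved * bandwidth_mb_per_s


# Hypothetical 3x3 tridiagonal matrix [[2,-1,0],[-1,2,-1],[0,-1,2]].
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
x = [1.0, 2.0, 3.0]

y = csr_matvec(row_ptr, col_idx, values, x)
bound = bandwidth_bound_mflops(3, 3, len(values), bandwidth_mb_per_s=1000.0)
print(y, round(bound, 1))
```

Dividing a measured Mflop/s rate by a bound of this kind gives a fraction analogous to the "over 80%" figure quoted above, although the actual model may account for cache behavior and other effects that this sketch ignores.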

Figure 4. Megaflop rate per processor (cubic grids, nested dissection).

Amestoy et al. conducted a comprehensive comparison of two state-of-the-art direct solvers for large sparse systems of linear equations on large-scale distributed-memory computers: MUMPS, a multifrontal solver, and SuperLU, a supernodal solver. The authors described the main algorithmic features of the two solvers and compared their performance characteristics with respect to uniprocessor speed, interprocessor communication, memory requirements, and scalability (Figure 4). They found that both solvers have strengths and weaknesses.


INVESTIGATORS
D. Keyes, Old Dominion University; B. Smith and J. More, Argonne National Laboratory; E. G. Ng, Lawrence Berkeley National Laboratory; R. Falgout, Lawrence Livermore National Laboratory; J. W. Demmel, University of California, Berkeley; O. Ghattas, Carnegie Mellon University; O. Widlund, New York University; S. McCormick, University of Colorado, Boulder; J. Dongarra, University of Tennessee.

PUBLICATIONS
R. Vuduc, J. W. Demmel, K. A. Yelick, S. Kamil, R. Nishtala, and B. Lee, “Performance optimizations and bounds for sparse matrix-vector multiply,” Proc. of the IEEE/ACM Conference on Supercomputing, 2002.

P. R. Amestoy, I. S. Duff, J.-Y. L’Excellent, and X. S. Li, “Analysis and comparison of two general sparse solvers for distributed memory computers,” ACM Transactions on Mathematical Software 27, 388 (2001).

URL
http://www.tops-scidac.org/

 
NERSC Annual Report 2002, Science Highlights