National Energy Research Scientific Computing Center 2004 Annual Report

Science-Driven Computing
NERSC continuously reassesses its approach to supporting high-end scientific computing, and from time to time undertakes a major reevaluation and realignment. In 2005 this involved several activities: the NERSC Users Group writing and publishing the latest DOE Greenbook, the development of NERSC’s five-year plan for 2006–2010, and a programmatic peer review conducted for the DOE. The overall theme of this evaluation and planning effort was Science-Driven Computing.
DOE Greenbook Published
The DOE Greenbook: Needs and Directions in High-Performance Computing for the Office of Science was compiled by Steve Jardin for the NERSC Users Group and published in June 2005 (Figure 1). With contributions from 37 scientists from a variety of disciplines and organizations, this report documents the computational science being done at NERSC and other DOE computing centers and provides examples of computational challenges and opportunities that will guide the evolution of these centers over the next few years.
According to the Greenbook, researchers in all of the disciplines supported by the Office of Science are finding that large-scale computational capabilities are now essential for the advancement of their research. Today’s most powerful computers and scientific application codes are being used to produce new and more precise scientific results at the cutting edge of each discipline, and this trend is destined to continue for years to come. The Greenbook presents many examples of the impact of large-scale computations on the sciences.
However, the Greenbook points out that the current computational resources available through NERSC are saturated, and the lack of additional computing resources is becoming “a major bottleneck in the scientific research and discovery process.” The report advises, “A large increase in computer power is needed in the near future to take the understanding of the science to the next level and to help secure the U.S. DOE SC leadership role in these fundamental research areas.”
Figure 1. The DOE Greenbook, prepared by the NERSC Users Group, is available online.
The Greenbook’s specific recommendations include:
- Expand the high performance computing resources available at NERSC, maintaining an appropriate system balance to support the wide range of large-scale applications involving production computing and development activities in the DOE Office of Science.
- Configure the computing hardware and queuing systems to minimize the time-to-completion of large jobs, as well as to maximize the overall efficiency of the hardware.
- Actively support the continued improvement of algorithms, software, and database technology for improved performance on parallel platforms.
- Significantly strengthen the computational science infrastructure at NERSC that will enable the optimal use of current and future NERSC supercomputers.
- Carefully evaluate the requirements of data- or I/O-intensive scientific applications in order to support as wide a range of science as possible.
NERSC’s Five-Year Plan
Guided by the DOE Greenbook, other interactions with the NERSC user community, and ongoing monitoring of technology trends, NERSC has developed a five-year plan focusing on three components: Science-Driven Systems, Science-Driven Services, and Science-Driven Analytics (Figure 2). NERSC management and staff have observed three trends that need to be addressed over the next several years:
- the widening gap between application performance and peak performance of high-end computing systems
- the recent emergence of large, multidisciplinary computational science teams in the DOE research community
- the flood of scientific data from both simulations and experiments, and the convergence of computational simulation with experimental data collection and analysis in complex workflows.
Figure 2. Conceptual diagram of NERSC’s plan for 2006–2010.
NERSC’s responses to these trends are the three components of the science-driven strategy that NERSC will implement over the next five years:
- Science-Driven Systems: Balanced introduction of the best new technologies for complete computational systems—computing, storage, networking, visualization and analysis—coupled with the activities necessary to engage vendors in addressing the DOE computational science requirements in their future roadmaps.
- Science-Driven Services: The entire range of support activities, from high-quality operations and user services to direct scientific support, that enable a broad range of scientists to effectively use NERSC systems in their research. NERSC will concentrate on resources needed to realize the promise of the new highly scalable architectures for scientific discovery in multidisciplinary computational science projects.
- Science-Driven Analytics: The architectural and systems enhancements and services required to integrate NERSC’s powerful computational and storage resources to provide scientists with new tools to effectively manipulate, visualize, and analyze the huge data sets derived from simulations and experiments.
This balanced set of objectives will be critical for the future of the NERSC Center and its ability to serve the DOE scientific community. Elements of this strategy that are currently being implemented are discussed in the following pages. The full five-year plan can be read at http://www.nersc.gov/news/reports/LBNL-57582.pdf.
DOE Review of NERSC
On May 17–19, 2005, a Programmatic Review of NERSC was conducted for the Department of Energy. The peer review committee was chaired by Frank Williams of the Arctic Region Supercomputing Center at the University of Alaska, Fairbanks. Other members were Walter F. Brooks of the Advanced Supercomputing Division at NASA Ames Research Center, Lawrence Buja of the National Center for Atmospheric Research, Cray Henry of the Defense Department’s High Performance Computing Modernization Program, Robert Meisner of the National Nuclear Security Administration, José L. Muñoz of the National Science Foundation, and Tomasz Plewa of the Center for Astrophysical Thermonuclear Flashes at the University of Chicago.
In addition to reviewing the DOE Greenbook and NERSC’s five-year plan, the review panel heard presentations and engaged in conversations covering all aspects of NERSC’s operations. The DOE had requested that the panel address a number of specific topics, but panel members were also given the freedom to look into any aspect of NERSC and to comment accordingly. The panel responded by presenting a detailed list of findings and recommendations to DOE and NERSC managers, who are now using those findings to improve NERSC’s operations.
The overall conclusions of the review committee included a strong endorsement of NERSC’s approach to enabling computational science:
NERSC is a strong, productive, and responsive science-driven center that possesses the potential to significantly and positively impact scientific progress by providing users with access to high performance computing systems, services, and analytics beneficial to the support and advancement of their science….
Members of the review panel each report that NERSC is extremely well run with a lean and knowledgeable staff. The panel members saw evidence of strong and committed leadership, and staff who are capable and responsive to users’ needs and requirements. Widespread, high regard for the center’s performance, reflected in such metrics as the high number of publications supported by NERSC, and its potential to positively impact future advancement of computational science, warrants continued support.
Organizational Changes
In order to implement the new initiatives and the changes in emphasis derived from the planning and review process, in November 2005 NERSC announced several organizational changes, including two new associate general managers, two new teams, and a new group.
“In order to efficiently carry out our plan and meet the expectations of our users and sponsors, we are modifying the NERSC Center organization,” General Manager Bill Kramer wrote in announcing the changes. “In addition to the Division, Department and Group components of the organization, we will have two other components: Functional Areas and Teams.”
NERSC has created two functional areas—Science-Driven Systems and Science-Driven Services. The majority of the NERSC staff will work in these two areas (Figure 3). The functional areas are responsible for carrying out the responsibilities and tasks discussed in the respective sections of NERSC’s five-year plan. Functional areas will be led by Associate General Managers (AGMs), who are responsible for coordinating activities across the groups and teams in their areas. Francesca Verdier is associate general manager for Science-Driven Services, and Howard Walter is associate general manager for Science-Driven Systems (Figure 4).
Figure 3. NERSC’s new organization reflects new priorities and promotes coordination across groups and teams.
Figure 4 (left). NERSC’s new associate general managers, Francesca Verdier and Howard Walter, are responsible for Science-Driven Services and Science-Driven Systems, respectively. Figure 5 (right). Wes Bethel heads NERSC’s new Analytics Team, while John Shalf heads the new Science-Driven System Architecture Team.
The Accounts and Allocations Team, the Analytics Team, the Open Software and Programming Group, and the User Services Group will report to the Science-Driven Services AGM. The Computational Systems Group, the Computer Operations and ESnet Support Group, the Mass Storage Group, and the Networking, Security and Servers Group will report to the Science-Driven Systems AGM.
The reorganization includes the creation of one new group and two new teams. They are:
- Analytics Team: Analytics is the intersection of visualization, analysis, scientific data management, human-computer interfaces, cognitive science, statistical analysis, and reasoning. The primary focus of the Analytics Team is to provide visualization and scientific data management solutions that help the NERSC user community better understand complex phenomena hidden in scientific data. The team’s responsibilities range from applying off-the-shelf commercial software, through advanced development, to creating new solutions where none previously existed. The Analytics Team is a natural expansion of the visualization efforts that have been part of NERSC since it moved to Berkeley Lab. Wes Bethel is the team leader (Figure 5). NERSC’s analytics strategy is discussed in more detail elsewhere in this report.
- Open Software and Programming (OSP) Group: The growing use of open-source software and partially supported software requires a change in how NERSC approaches its future software needs. Such software is now a key component of NERSC’s ability to provide high-quality systems and services. This group is responsible for the support and improvement of open-source and other partially supported software, particularly the software that NERSC uses for infrastructure, operations, and delivery of services. Key efforts include open-source engineering, development and support of middleware (Grid and Web tools), and NERSC’s software infrastructure. Francesca Verdier is the acting group leader until a permanent leader is recruited.
- Science-Driven System Architecture (SDSA) Team: This team performs ongoing evaluation and assessment of technology for scientific computing. The SDSA Team has expertise in benchmarking, system performance evaluations, workload monitoring, use of application modeling tools, and future algorithm scaling and technology assessment. Using scientific methods, the team will develop techniques for analyzing possible technical alternatives and will create a clear understanding of current and future NERSC workloads. The SDSA Team will engage with vendors and the general research community to advocate technological features that will enhance the effectiveness of systems for NERSC scientists. The team is responsible for ongoing management of a suite of benchmarks that NERSC and Berkeley Lab use for architectural evaluation and procurement. This suite includes composite benchmarks and metrics such as Sustained System Performance (SSP), Effective System Performance (ESP), variation, reliability, and usability. The team will draw matrixed staff from both NERSC and Berkeley Lab’s Computational Research Division for specific areas such as algorithm tracking and scaling, which are designed to develop and document future algorithmic requirements. The scientific focus for this effort will change periodically and will start with applied mathematics and astrophysics. The SDSA Team leader is John Shalf (Figure 5). The team’s current activities are discussed elsewhere in this report.
Figure 6. Promoted to group leaders were Brent Draney of the Networking, Security and Servers Group and Jonathan Carter of the User Services Group.
Completing the reorganization, Jonathan Carter succeeds Francesca Verdier as leader of the User Services Group, and Brent Draney succeeds Howard Walter as leader of the Networking, Security and Servers Group (Figure 6).