PETASCALE SYSTEMS INTEGRATION INTO LARGE SCALE FACILITIES WORKSHOP
Agenda:
DAY ONE: May 15, 2007 – 8 am start
First Session – Plenary
Introduction and Logistics – Bill Kramer/Yeen Mankin
Welcome – Dan Hitchcock
Motivation for the Workshop – Bill Kramer
System Integration at LLNL – Mark Seager
System Integration at NCAR – Tom Bettge
Break
All breakout session chairs will be asked to report back to the plenary on the following issues:
- What are the major challenges in this area?
- What methods and technology are currently being used and how do we use them?
- What methods and technology work and which ones do not?
- What tools and technology do we wish we had – particularly for Petascale systems?
- Other observations/suggestions/issues
Second Session – Breakouts
Breakout 1 – Integration Issues for Facilities
– Petascale systems are pushing the limits of facilities in terms of
space, power, cooling and even weight. There are many complex
issues to deal with when integrating large scale systems and these will
get more challenging with Petascale systems. While we all hope
technology will reverse these trends, can we count on it? Besides
building large facilities (at Moore’s law rates) how can we better
optimize facilities? How can the lead times and costs for site
preparation be reduced? Can real-time adjustments be made rather
than over-designing?
Breakout 1 Leaders - Howard Walter, Gary New (Steve Lowe)
Breakout 2 – Performance Assessment of Systems
– There are many tools and benchmarks that help assess performance of
systems, ranging from single performance kernels to full
applications. Performance tests can be kernels, specific
performance probes and composite assessments. What are the most
effective tools? What scale tests are needed to set system
performance expectations and to assure system performance? What
are the best combinations of tools and tests?
Breakout 2 Leaders – Tom Engel (NCAR), Rob Pennington (NCSA) (H Wassermann)
Breakout 3 – Methods of Testing and Integration
– There is a range of methods for fielding large scale systems,
including self integration, cooperative development, factory
testing, and on-site acceptance testing. Each site and system has
different goals and selects from the range of methods. When are
different methods appropriate? What is the right balance between
the different approaches? Are there better combinations than
others?
Breakout 3 Leaders – Brad Comes (DOD), Buddy Bland (ORNL) (N Cardo/F Verdier)
Third Session – Plenary
Reports from breakouts
Panel – The Vendor Side of Deployments – TBD (Cray), Chulho Kim (IBM),
Renato Ribeiro (Sun), Dave Sundstrom (Linux Networx)
Petascale HPC Deployments: Sun's Perspective, Renato Ribeiro, Ph.D., Manager, Integrated Systems Marketing
Working Dinner
Panel – If only I had known! The biggest blunders/mistakes and humorous
experiences in large system deployments (All)
DAY TWO: May 16, 2007 – 8 am start
Fourth Session – Breakouts
Breakout 4 – Systems and User Environment Integration Issues
– Breakout session #2 looked at performance and benchmarking
tools. While performance is one element of successful systems, so
are effective resource management, reliability, consistency and
usability, to name a few. Other than performance, what other
areas are critical to successful integration? How are these
evaluated?
Breakout 4 Leaders - Mike McCraney (MHPCC), TBD (T Davis)
Breakout 5 – Early warning signs of problems – detecting and handling
– Fielding large scale systems is a major
project in its own right, and takes cooperation between site staff,
stakeholders, users, vendors, third party contributors and many
more. How can early warning signs of problems be detected?
When they are detected, what should be done about them? How can
they be best handled to have the highest and quickest success?
How do we ensure long-term success vs the pressure of quick milestone
accomplishment? Will the current focus on formal project
management methods help or hinder?
Breakout 5 Leaders - Bob Tomlinson (LANL), Jim Kasdorf (PSC) (R Gerber)
Breakout 6 – How to keep systems running up to expectations
– Once systems are integrated and accepted, is the job done? If
systems pass a set of tests, will they continue to perform at the level
they start at? How can we assure systems continue to deliver what
is expected? What levels and types of continuous testing are
appropriate?
Breakout 6 Leaders - Dave Skinner (NERSC), Kevin Regimbal (PNNL) (T Butler)
Fifth Session – Plenary
Reports from breakouts
Panel Session – How will Petascale systems change what we have been
doing? – Ray Bair (ANL), Phil Andrews (SDSC), Brad Comes (DODMod)
How Should Petascale Systems Change what we are Doing?, Ray Bair, Director, Argonne Leadership Computing Facility.
How will Petascale systems change what we have been doing?, Phil Andrews, Patricia Kovatch, SDSC
DoD HPC Modernization Program, Bradley Comes
Sixth Session – Plenary
Report Summary
Conclusion
Workshop adjourns – 5 pm on May 16