NERSC Bassi update

From: Richard Gerber (ragerber_at_lbl.gov)
Date: 12/21/2007


Dear Bassi users,

Holiday Greetings! We hope Bassi continues to be a productive machine
for you. The system recently ran uninterrupted for more than three
months before last week's brief reboot.

We rebooted the machine on Dec. 13 to clear "stuck" memory segments on
many of the nodes. Some memory on the affected nodes was unavailable to
your applications. Most codes were probably unaffected, but those with
extreme memory bandwidth requirements, or those trying to use all
available memory, may have had problems. All nodes are back to normal,
and we are working with IBM to keep this from happening again.

This issue was identified by NERSC's performance monitoring efforts, the
results of which are available online at
https://www.nersc.gov/nusers/systems/bassi/monitor.php. Examine the
"memrate" results to see how an extremely memory-intensive routine was
affected by the recent problems.

We have been saving memory "snapshots" of each node every 15 minutes, and
this data for each job is available from the NERSC completed jobs page at
https://www.nersc.gov/nusers/status/jobs/index.php
If your job ran for 30 minutes or more, you can click on the Job ID
(Bassi jobs only) to see how your job used memory on each node (not each
MPI task) as a function of time. If you ran using the NERSC IPM
performance utility, the memory snapshots will appear on the same page
as your IPM results. We hope these web pages automatically provide you
with valuable information about your run with little to no effort on
your part. If you are unfamiliar with IPM, please see
http://www.nersc.gov/nusers/resources/software/tools/ipm.php

Finally, if you are moving from Seaborg to Bassi, please review the
Quick Start Guide for Seaborg Users at
http://www.nersc.gov/nusers/systems/bassi/quick.php

Regards,
Richard Gerber

-- 
Richard Gerber, Ph.D.                      ragerber@lbl.gov  
NERSC                                      phone: 510-486-6820
Lawrence Berkeley National Lab             fax:   510-486-4316 
Berkeley, CA 94720
