sp-announcements mailing list archive
Using standard input on Seaborg

From: David Turner (dpturner_at_lbl_dot_gov)
Date: 09/13/2006

    Greetings, Seaborg User,
    
    This message describes a workaround for certain job failures on Seaborg.
    The workaround can also yield slight performance improvements for any
    parallel code that reads from standard input ("stdin").
    
    In a batch job, there are typically three ways that a parallel program
    (called "a.out" in the following examples) can read from stdin:
    
    1) Input redirection.  Example:
    
    ./a.out < my_input_file
    
    2) Pipe.  Example:
    
    cat my_input_file | ./a.out
    
    3) "Here document".  Example:
    
    ./a.out << EOF
    data
    data
    data
    EOF
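
From the program's point of view, the three forms above deliver the same
byte stream on stdin. The following is a minimal sketch of that equivalence,
not a Seaborg batch script: "cat" stands in for ./a.out, and the temporary
file and the variable names are assumptions made for illustration only.

```shell
#!/bin/sh
# Stand-in for ./a.out: "cat" simply echoes whatever arrives on stdin.
# The temporary file plays the role of my_input_file (hypothetical name).
input_file=$(mktemp)
printf 'data\ndata\ndata\n' > "$input_file"

# 1) Input redirection
via_redirect=$(cat < "$input_file")

# 2) Pipe
via_pipe=$(cat "$input_file" | cat)

# 3) "Here document"
via_heredoc=$(cat << EOF
data
data
data
EOF
)

rm -f "$input_file"
```

All three variables end up holding the same three lines, so the choice
between the forms is a matter of convenience in the batch script.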
    
    In all these cases, IBM's Parallel Environment (PE) will arrange for
    _all_ tasks of the parallel program to get access to the standard input
    stream.  In the best case, this adds a small amount of overhead to
    each task; in the worst case, some larger programs have been failing with
    "pulse timeout" error messages.
    
    It has been our experience that most (but not all) parallel programs that
    read stdin do so in only one task.  This is typically the "master" task,
    identified as MPI rank 0.  After reading the input, the master computes
    some derived quantities, and then distributes data to the remaining tasks.
    In this common model, there is no reason for all the tasks to have access
    to stdin.  If your application fits this model, you should set the following
    environment variable in your batch scripts:
    
    setenv MP_STDINMODE 0 (csh/tcsh)
    export MP_STDINMODE=0 (sh/bash/ksh)
    
    The above settings will bind stdin to task 0; other tasks will not have access
    to stdin.  NOTE:  if you use this setting, but all your tasks really _do_
    need access to stdin, your program will hang indefinitely.
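
As a rough sketch, the setting might sit in a sh/ksh batch script as shown
below. The batch system directives are omitted, the program and input file
names are the ones used in the examples above, and the existence check is
only an illustrative guard (an assumption, not part of the recommended
setting) so that the fragment does nothing when those files are absent.

```shell
#!/bin/sh
# (batch system directives omitted)

# Bind stdin to task 0 only; the other tasks get no access to stdin.
MP_STDINMODE=0
export MP_STDINMODE

# Illustrative guard: launch only if the program and its input are present.
if [ -x ./a.out ] && [ -r my_input_file ]; then
    ./a.out < my_input_file
fi
```

The variable has to be set before the line that launches the parallel
program, since it is read from the environment of that program.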
    
    IBM is working on a solution to the "pulse timeout" problem, and we expect
    to install it on Seaborg when it becomes available.  However, we believe the
    above settings will always be appropriate for the large majority of jobs on
    Seaborg, even after this fix is installed.
    
    If you have any questions about this issue, please contact NERSC Consulting at:
    
         1-800-66-NERSC, menu option 3, 8 am - 5 pm, Pacific time
         (510) 486-8600, menu option 3, 8 am - 5 pm, Pacific time
         consult_at_nersc_dot_gov
         http://help.nersc.gov/
    
    -- 
    Best regards,
    
    David Turner
    User Services Group        email: dpturner_at_lbl_dot_gov
    NERSC Division             phone: (510) 486-4027
    Lawrence Berkeley Lab        fax: (510) 486-4316
    
