NERSC Announcements Message Archive
Subject: Using standard input on Seaborg
Author: David Turner <dpturner_at_lbl.gov>
Date: 2006-09-13 16:52:31
Greetings Seaborg User,
This message describes a workaround for certain job failures on Seaborg.
The workaround can also yield slight performance improvements for any
parallel code that reads from standard input ("stdin").
In a batch job, there are typically three ways that a parallel program
(called "a.out" in the following examples) can read from stdin:
1) Input redirection. Example:
./a.out < my_input_file
2) Pipe. Example:
cat my_input_file | ./a.out
3) "Here document". Example:
./a.out << EOF
data
data
data
EOF
In all these cases, IBM's Parallel Environment (PE) will arrange for
_all_ tasks of the parallel program to get access to the standard input
stream. In the best case, this adds a small amount of overhead to
each task; in the worst case, some larger programs have been failing with
"pulse timeout" error messages.
It has been our experience that most (but not all) parallel programs that
read stdin do so in only one task. This is typically the "master" task,
identified as MPI rank 0. After reading the input, the master computes
some derived quantities, and then distributes data to the remaining tasks.
In this common model, there is no reason for all the tasks to have access
to stdin. If your application fits this model, you should use the following
environment variable in your batch scripts:
    setenv MP_STDINMODE 0        (csh/tcsh)
    export MP_STDINMODE=0        (sh/bash/ksh)
The above settings will bind stdin to task 0; other tasks will not have access
to stdin. NOTE: if you use this setting, but all your tasks really _do_
need access to stdin, your program will hang indefinitely.
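As a minimal sketch, the setting might be placed in an sh-style batch script as shown here. The names a.out and my_input_file are the placeholders used in the examples above. The existence check is added only for illustration, and any batch-system directives your job needs are omitted.

```shell
#!/bin/sh
# Sketch only: batch-system directives are omitted here.

# Bind stdin to task 0; the other tasks get no access to stdin.
MP_STDINMODE=0
export MP_STDINMODE

# a.out and my_input_file are placeholder names from the examples above.
# The existence check is illustrative and not part of the workaround.
if [ -x ./a.out ] && [ -r my_input_file ]; then
    ./a.out < my_input_file
else
    echo "a.out or my_input_file not found; nothing was run" >&2
fi
```

The variable is set before the program is launched, so it is already in the environment when the parallel job starts. A csh/tcsh script would use the setenv form in the same position.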
IBM is working on a solution to the "pulse timeout" problem, and we expect
to install it on Seaborg when it becomes available. However, we believe the
above settings will remain appropriate for the large majority of jobs on
Seaborg, even after the fix is installed.
If you have any questions about this issue, please contact NERSC Consulting at:
1-800-66-NERSC, menu option 3, 8 am - 5 pm, Pacific time
(510) 486-8600, menu option 3, 8 am - 5 pm, Pacific time
consult@nersc.gov
http://help.nersc.gov/
--
Best regards,
David Turner
User Services Group        email: dpturner@lbl.gov
NERSC Division             phone: (510) 486-4027
Lawrence Berkeley Lab      fax:   (510) 486-4316