Depending on your data access, you may need to submit jobs to a specific farm. This is done by submitting to the appropriate LSF batch queue; refer to the table below. Multi-core OpenMPI jobs should be run in either the psnehmpiq or psfehmpiq batch queue (see the following section on "Submitting OpenMPI Batch Jobs"). Simulation jobs should be submitted to the low-priority queues psnehidle and psfehidle.
| Experimental Hall | Queue | Nodes | Data | Comments |
|---|---|---|---|---|
| NEH | psnehq | psana11xx | ana01, ana02 | Jobs <= 6 cores |
| NEH | psnehmpiq | psana11xx, psana12xx | ana01, ana02 | OpenMPI jobs > 6 cores, preemptable |
| NEH | psnehidle | psana12xx | | Simulations, preemptable, low priority |
| FEH | psfehq | psana13xx | ana11, ana12 | Jobs <= 6 cores |
| FEH | psfehmpiq | psana13xx, psana14xx | ana11, ana12 | OpenMPI jobs > 6 cores, preemptable |
| FEH | psfehidle | psana14xx | | Simulations, preemptable, low priority |
LSF (Load Sharing Facility) is the job scheduler used at SLAC to execute user batch jobs on the various batch farms. LSF commands can be run from a number of SLAC servers, but it is best to use psexport or pslogin. First log in to pslogin (from SLAC) or to psexport (from anywhere). From there you can submit a job with the following command:
```
bsub -q psnehq -R "rusage[mem=1024]" my_program
```
The RedHat supplied OpenMPI packages are installed on pslogin, psexport and all of the psana batch servers.
The system default has been set to the current version as supplied by RedHat.
```
$ mpi-selector --query
default:openmpi-1.4-gcc-x86_64
level:system
```
Your environment should be set up to use this version (unless you have used RedHat's mpi-selector script, or your login scripts, to override the default). You can check that your PATH is correct by issuing the command which mpirun. Currently, this should return /usr/lib64/openmpi/1.4-gcc/bin/mpirun; future updates to the MPI version may change the exact details of this path. In addition, your LD_LIBRARY_PATH should include /usr/lib64/openmpi/1.4-gcc/lib (or something similar).
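These checks can be scripted; the following is a minimal sketch, assuming the OpenMPI install paths contain the substring "openmpi" as in the locations above (which may change with future RedHat updates):

```shell
# Report whether mpirun is on PATH and whether LD_LIBRARY_PATH
# mentions an OpenMPI lib directory (substring check is an assumption).
mpirun_path=$(which mpirun 2>/dev/null || true)
echo "mpirun: ${mpirun_path:-not found on PATH}"
case ":$LD_LIBRARY_PATH:" in
  *openmpi*) echo "LD_LIBRARY_PATH includes an OpenMPI lib directory" ;;
  *)         echo "LD_LIBRARY_PATH does not mention OpenMPI; set it if MPI programs fail to run" ;;
esac
```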
For notes on compiling examples, please see:
http://www.slac.stanford.edu/comp/unix/farm/mpi.html
The following are examples of how to submit OpenMPI jobs to the PCDS psnehmpiq batch queue:
```
bsub -q psnehmpiq -a mympi -n 32 -o ~/output/%J.out ~/bin/hello
```
This submits an OpenMPI job (-a mympi) requesting 32 processors (-n 32) to the psnehmpiq batch queue (-q psnehmpiq).
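bsub confirms each submission with a message like "Job <12345> is submitted to queue <psnehmpiq>."; the job ID in that message is what bjobs and bkill operate on. A sketch of capturing it in a script, where the sample string stands in for a real bsub call (which you would run on pslogin or psexport):

```shell
# Parse the LSF job ID out of bsub's confirmation message.
# On a real submission host, msg would come from bsub itself, e.g.
#   msg=$(bsub -q psnehmpiq -a mympi -n 32 -o ~/output/%J.out ~/bin/hello)
msg='Job <12345> is submitted to queue <psnehmpiq>.'
jobid=$(expr "$msg" : '.*Job <\([0-9]*\)>')
echo "job id: $jobid"
```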
```
bsub -q psfehmpiq -a mympi -n 16 -R "span[ptile=1]" -o ~/output/%J.out ~/bin/hello
```
This submits an OpenMPI job (-a mympi) requesting 16 processors (-n 16), spanned one processor per host (-R "span[ptile=1]"), to the psfehmpiq batch queue (-q psfehmpiq).
```
bsub -q psfehmpiq -a mympi -n 12 -R "span[hosts=1]" -o ~/output/%J.out ~/bin/hello
```
This submits an OpenMPI job (-a mympi) requesting 12 processors (-n 12), all placed on a single host (-R "span[hosts=1]"), to the psfehmpiq batch queue (-q psfehmpiq).
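The two span[] forms divide the same -n count differently: span[ptile=P] caps the job at P processes per host, so -n N needs ceil(N/P) hosts, while span[hosts=1] forces all N onto a single host. A small arithmetic sketch (not an LSF call) of the ptile case:

```shell
# How many hosts does -n 16 with span[ptile=1] occupy?
n=16
ptile=1
hosts=$(( (n + ptile - 1) / ptile ))   # ceil(n / ptile)
echo "-n $n with span[ptile=$ptile] spreads over $hosts hosts"
```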
Report the status of all jobs (running, pending, finished, etc.) submitted by the current user:
```
bjobs -w -a
```
Report only running or pending jobs submitted by user "radmer":
```
bjobs -w -u radmer
```
Report running or pending jobs for all users in the psnehq queue:
```
bjobs -w -u all -q psnehq
```
Kill a specific batch job by its job ID number; the bjobs command can be used to find the appropriate job ID. (Note that only batch administrators can kill jobs belonging to other users.)
```
bkill JOB_ID
```
Report current node usage on the two NEH batch farms:
```
bhosts -w ps11farm ps12farm
```
The following links give more detailed LSF usage information:
- PowerPoint presentation describing LSF for LCLS users at SLAC