Overview
SLAC has two MPI queues for parallel computing: bulletmpi and bulletmpi-large. They are open to anyone with a SLAC unix account, but because we monitor these queues more closely than the others, you must email unix-admin to request access. Jobs submitted to these queues run on some of the same batch hosts used by the general farm.
- bulletmpi allows jobs to request between 8 and 512 cores
- bulletmpi-large allows jobs to request between 513 and 2048 cores
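The core ranges above can be sketched as a small shell helper that picks the queue for a given core count. This is only a minimal sketch, not a site-provided tool: the function name pick_queue is hypothetical, and the ranges are copied from the list above.

```shell
# Hypothetical helper (not part of LSF or the SLAC environment):
# print the queue whose core range contains the requested core count.
pick_queue() {
    cores=$1
    # Reject empty or non-numeric input before doing integer comparisons.
    case $cores in
        ''|*[!0-9]*)
            echo "error: core count must be a positive integer" >&2
            return 2 ;;
    esac
    if [ "$cores" -ge 8 ] && [ "$cores" -le 512 ]; then
        echo bulletmpi
    elif [ "$cores" -ge 513 ] && [ "$cores" -le 2048 ]; then
        echo bulletmpi-large
    else
        # Fewer than 8 or more than 2048 cores fits neither queue.
        echo "error: $cores cores is outside the 8-2048 range" >&2
        return 1
    fi
}
```

For example, pick_queue 64 prints bulletmpi and pick_queue 1024 prints bulletmpi-large, while pick_queue 1 fails with an error message.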
Single-slot jobs are not allowed in these queues. There is also a limit on the total number of cores in use across the bulletmpi and bulletmpi-large queues. The blimits command shows that limit. In the output below, 3072 slots in total are available to the two queues, and all of them were in use when the command was run:
renata@victoria $ 13:41 blimits -w
INTERNAL RESOURCE LIMITS:
NAME                       USERS     QUEUES                     HOSTS        PROJECTS  SLOTS      MEM  TMP  SWP  JOBS
bulletmpi_total_limit      -         bulletmpi bulletmpi-large  bulletfarm/  -         3072/3072  -    -    -    -
bulletmpi_slot_limit       hezaveh   bulletmpi                  -            -         288/512    -    -    -    -
bulletmpi_slot_limit       lehmann   bulletmpi                  -            -         128/512    -    -    -    -
bulletmpi_slot_limit       sforeman  bulletmpi                  -            -         256/512    -    -    -    -
bulletmpi_slot_limit       frubio    bulletmpi                  -            -         32/512     -    -    -    -
bulletmpi_slot_limit       weast     bulletmpi                  -            -         300/512    -    -    -    -
bulletmpi_slot_limit       cuoco     bulletmpi                  -            -         20/512     -    -    -    -
bulletmpi_long_slot_limit  shoeche   bulletmpi-large            -            -         2048/2048  -    -    -    -
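To see how many slots are still free before submitting, the bulletmpi_total_limit row of this output can be parsed. The sketch below assumes the SLOTS value is always printed as used/limit, as in the sample above; the function name free_slots is hypothetical.

```shell
# Hypothetical helper: read `blimits -w` output on stdin and print the
# number of unused slots under bulletmpi_total_limit.
# Assumption: the SLOTS value has the form used/limit (e.g. 3072/3072).
free_slots() {
    awk '$1 == "bulletmpi_total_limit" {
        # The QUEUES column can hold several names, so field positions
        # vary; take the first field that looks like used/limit.
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+\/[0-9]+$/) {
                split($i, s, "/")
                print s[2] - s[1]
                exit
            }
        }
    }'
}
```

It would be used as blimits -w | free_slots. For the output shown above it prints 0, since all 3072 slots are in use.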
Two flavors of MPI are available at SLAC: the stock RedHat rpm version and the RedHat version compiled with LSF hooks. Each uses a slightly different bsub command to submit jobs.
Stock MPI
This version of MPI is the default on most of the public login machines. You are running this version if which mpirun gives the following response:
renata@rhel6-64f $ 15:38 which mpirun
/usr/lib64/openmpi/bin/mpirun
A bsub command which uses this MPI should look like:
bsub -q bulletmpi -a mympi -n <# cores> <mpi job>
MPI with LSF hooks
This version of MPI is available to bsub when you ssh to bullet, which logs you in to either bullet0001 or bullet0002. (As described earlier, email unix-admin to request access to the bulletmpi queues.) In this case, the response to which mpirun should look like:
/opt/lsf-openmpi/1.5.4/bin//mpirun
You can also check that the version of mpi with LSF hooks is loaded by running:
renata@bullet0002 $ module list
Currently Loaded Modulefiles:
1) lsf-openmpi_1.5.4-x86_64
A bsub command which uses this mpi should look like:
bsub -q bulletmpi mpirun -n <# cores> <mpi job>
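Because the two flavors need different bsub commands, it can help to choose the command from the mpirun path. The following is a minimal sketch under the assumption that the two install paths shown on this page are the only ones in use; the function name mpi_bsub_command is hypothetical, and it only prints the command rather than submitting anything.

```shell
# Hypothetical helper: given the path printed by `which mpirun`, a core
# count, and the MPI program, print the matching bsub command from this page.
mpi_bsub_command() {
    mpirun_path=$1
    cores=$2
    job=$3
    case $mpirun_path in
        /usr/lib64/openmpi/*)
            # Stock RedHat MPI
            echo "bsub -q bulletmpi -a mympi -n $cores $job" ;;
        /opt/lsf-openmpi/*)
            # MPI compiled with LSF hooks
            echo "bsub -q bulletmpi mpirun -n $cores $job" ;;
        *)
            echo "error: unrecognized mpirun path: $mpirun_path" >&2
            return 1 ;;
    esac
}
```

A typical call would be mpi_bsub_command "$(which mpirun)" 64 ./my_mpi_job, where ./my_mpi_job stands in for your own program. Review the printed command before running it.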