
All SLAC users can run parallel jobs on the shared "bullet" cluster, which has 5024 cores. Each cluster node is configured as follows:

  • RHEL6 64-bit x86 OS
  • 2.2GHz Sandy Bridge CPUs
  • 16 cores per node
  • 64GB RAM per node
  • QDR (40Gb) InfiniBand for MPI communication
  • 10Gb Ethernet for SLAC networking

If your MPI or multicore job needs 16 or fewer cores and its memory requirement fits on a single host, you should submit to the general queues. Use the following syntax to run an 8-core job on a single host:

bsub -n 8 -R "span[hosts=1]" -W <runlimit> <executable>
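
For example, assuming a hypothetical executable named my_program and a two-hour run limit, the command might look like this:

bsub -n 8 -R "span[hosts=1]" -W 2:00 my_program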

More information on the general queues: https://confluence.slac.stanford.edu/display/SCSPub/High+Performance+Computing+at+SLAC

For parallel jobs that require multiple hosts, there are 2 public queues for MPI computing on the bullet cluster: bulletmpi and bulletmpi-large. They are available to anyone with a SLAC unix account. Jobs submitted to bulletmpi and bulletmpi-large will reserve entire hosts and run on these hosts exclusively. Please send email to unix-admin@slac.stanford.edu to request access to these queues.

 

  • bulletmpi for jobs between 8 and 512 cores
  • bulletmpi-large for jobs between 513 and 2048 cores

Queue            min. # cores    max. # cores    default runtime    max. runtime
bulletmpi        8               512             15 mins            7 days
bulletmpi-large  513             2048            15 mins            1 day
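
As an illustration, the sketch below submits a 64-core MPI job to the bulletmpi queue with a 4-hour run limit. Here my_mpi_program is a placeholder for your own OpenMPI executable, and it is assumed that mpirun picks up the core allocation from LSF, so no -np argument is given:

bsub -q bulletmpi -n 64 -W 4:00 mpirun my_mpi_program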

 

Single-slot jobs are not allowed in these queues. You should specify the wallclock runtime using the -W <minutes> or -W <hours:minutes> bsub arguments. There is also a limit on the total number of cores (slots) in use by the bulletmpi and bulletmpi-large queues. You can check the current slot usage and the slot limits by running the blimits command. The output below shows that the combined slot total for bulletmpi and bulletmpi-large is limited to 3072 slots, and that all 3072 slots are in use:

...

An earlier version of OpenMPI, 1.5.4, is also available, but OpenMPI 1.8.1 has the advantage of being able to run on hosts with different InfiniBand speeds; OpenMPI 1.5.4 can have communication problems if it attempts to run across hosts at different speeds. However, if you need the older version, you can set up your environment to use it instead when running on the bullet nodes by making the following changes to the appropriate login script; csh or tcsh users will update .cshrc, and bash users will update .bash_profile or .bashrc:

...