All SLAC users can run parallel jobs on the shared "bullet" cluster, which has 5024 cores. Each cluster node is configured as follows:

  • 64-bit RHEL6 x86 nodes

  • 2.2GHz Sandy Bridge CPUs

  • 16 cores per node

  • 64GB RAM per node

  • QDR (40Gb/s) InfiniBand for MPI communication

  • 10Gb/s Ethernet for SLAC networking

If your MPI or multicore job needs 16 or fewer cores and its memory requirement fits on a single host, you should submit to the general queues. Use the following syntax to run an 8-core job on a single host:

bsub -n 8 -R "span[hosts=1]" -W <runlimit> <executable>
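
If the job also needs a guaranteed amount of memory, bsub can reserve it alongside the span requirement. The sketch below assumes memory is accounted in MB per slot, which is a common but site-configurable LSF convention; check with the administrators if unsure:

bsub -n 8 -R "span[hosts=1] rusage[mem=4000]" -W <runlimit> <executable>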

More information on the general queues: https://confluence.slac.stanford.edu/display/SCSPub/High+Performance+Computing+at+SLAC

For parallel jobs that require multiple hosts, there are 2 public queues for MPI computing on the bullet cluster: bulletmpi and bulletmpi-large. They are available to anyone with a SLAC unix account. Jobs submitted to bulletmpi and bulletmpi-large reserve entire hosts and run on those hosts exclusively. Please send email to unix-admin@slac.stanford.edu to request access to these queues.


  • bulletmpi for jobs between 8 and 512 cores
  • bulletmpi-large for jobs between 513 and 2048 cores

Queue            min. # cores  max. # cores  default runtime  max. runtime
bulletmpi        8             512           15 mins          7 days
bulletmpi-large  513           2048          15 mins          1 day
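
For example, to run a 64-core MPI job on bulletmpi with a 2 hour runtime limit, or a 1024-core job on bulletmpi-large with a 12 hour limit, commands along these lines should work. The mpirun launcher is an assumption here; substitute whatever launcher your MPI installation provides:

bsub -q bulletmpi -n 64 -W 2:00 mpirun <executable>
bsub -q bulletmpi-large -n 1024 -W 12:00 mpirun <executable>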


Single-slot jobs are not allowed in these queues. You should specify the wall-clock runtime using the -W <minutes> or -W <hours:minutes> bsub arguments. There is also a limit on the total number of cores (slots) that the bulletmpi and bulletmpi-large queues may use at one time. You can check the current slot usage and the slot limits by running the blimits command. The output below shows that the combined slot total for bulletmpi and bulletmpi-large is limited to 3072 slots, and that all 3072 slots are in use:

...
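
To see only the limits that apply to one of these queues, blimits accepts a queue filter; if this form is rejected on your LSF version, check blimits -h for the supported options:

blimits -q bulletmpi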