...

SLAC batch resources consist of several generations of hardware, which are listed on the shareholder's priority page. Some of the batch nodes run the RHEL 6 operating system, while others run CentOS 7. Singularity container technology is available on the CentOS 7 batch nodes.

  • To run your job on a RHEL 6 batch node only, use:  bsub -R "select[rhel6]" ...

  • To run your job on a CentOS 7 batch node only, use: bsub -R "select[centos7]" ...
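Because Singularity is available only on the CentOS 7 nodes, a containerized job would normally be pinned there. The following is a minimal dry-run sketch, not a site-endorsed recipe: the image path and the job script name are hypothetical placeholders, and the script only prints the submission command rather than running it.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own image and payload.
IMAGE="/path/to/image"
PAYLOAD="./myjob.sh"

# Singularity is only available on CentOS 7 batch nodes, so select them.
SELECT='select[centos7]'

# Print the submission command (dry run). Copy the printed line into
# your shell to actually submit the job.
submit_cmd() {
    printf 'bsub -R "%s" singularity exec %s %s\n' "$SELECT" "$IMAGE" "$PAYLOAD"
}

submit_cmd
```

Printing the command first makes it easy to check the quoting of the -R string before anything reaches the scheduler.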

...

Of course, the more resources you request, the harder it is to schedule the job, and the longer it will stay pending.

Here is a more complex example of a resource selection string for you to pick and choose from:

  •  -R "select[ ! preempt && rhel60 && cvmfs && inet && bullet] rusage[scratch=5.0:duration=1440:decay=1, mem=2000:decay=0] span[hosts=1]"

It requests that the job be dispatched to a machine where:

  1. Your job won't be preempted by someone else's higher-priority job.
  2. The machine runs the RHEL 6 operating system (we have "rhel60" and "centos7").
  3. The machine has CVMFS.
  4. The machine has an outbound internet connection (the "inet" keyword above).
  5. The machine is part of the "bullet" cluster (other clusters we have: fell, hequ, dole, kiso, deft and bubble; all run "rhel60" except the last two, which run "centos7").
  6. 5 GB of free space is reserved under /scratch. Your job reserves it for 1440 minutes, and the reserved amount decays linearly from 100% to 0 during this period.
  7. 2000 MB of RAM is reserved, with no decay (the default).
  8. span[hosts=1] means that if you request more than one batch slot (the -n option above), all of them are scheduled on one machine.
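Since the string is meant to be picked apart and recombined, it can help to keep its three sections in separate variables. The sketch below is a dry run that only prints the resulting command. It assumes that "&&" is intended between every pair of select keywords, and the slot count (4) and job script name are hypothetical.

```shell
#!/bin/sh
# One variable per section of the resource requirement string.
# Drop or edit keywords here to loosen or tighten the request.
SELECT='select[!preempt && rhel60 && cvmfs && inet && bullet]'
RUSAGE='rusage[scratch=5.0:duration=1440:decay=1, mem=2000:decay=0]'
SPAN='span[hosts=1]'

# Join the sections into the single string that -R expects.
build_resreq() {
    printf '%s %s %s' "$SELECT" "$RUSAGE" "$SPAN"
}

# Dry run: print the command instead of submitting it.
# The slot count (4) and ./myjob.sh are hypothetical.
printf 'bsub -n 4 -R "%s" ./myjob.sh\n' "$(build_resreq)"
```

Removing a keyword from SELECT, or dropping the RUSAGE section entirely, widens the set of eligible machines and should shorten the pending time.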

Note:

  • a. For 6 or 7 to work, the machine must have that amount of the resource available at the time the job is dispatched.
  • b. The CVMFS cache is usually stored under /scratch/cvmfs2_cache. (Reserving /scratch space is a way to make sure there is free space for the CVMFS cache, so your job won't get errors when accessing CVMFS.)
  • c. Most SLAC batch users don't use 6 or 7. Even those who do can consume more than they requested, because the amounts specified in 6 and 7 are reservations, not limits. So after your job starts on a machine, something bad can still happen (running out of memory, /scratch space, etc.) because of other activity on that machine.
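Because a reservation is not a guarantee, a job script may want to check free space itself before doing heavy work. The following is one possible guard, a sketch that uses only the standard df and awk tools. The helper name has_free_kb is made up for this example, and the 5 GB threshold simply mirrors the scratch reservation shown earlier.

```shell
#!/bin/sh
# Succeed if the filesystem holding path $1 has at least $2 KB available.
# has_free_kb is a hypothetical helper, not part of LSF.
has_free_kb() {
    # df -P prints a header plus one line per filesystem; column 4 is
    # the available space, in 1024-byte blocks because of -k.
    avail=$(df -Pk "$1" 2>/dev/null | awk 'NR==2 {print $4}')
    [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}

# Example guard for the top of a job script: require about 5 GB.
NEED_KB=$((5 * 1024 * 1024))
if has_free_kb /scratch "$NEED_KB"; then
    echo "enough free space under /scratch, continuing"
else
    echo "not enough free space under /scratch" >&2
    # exit 1   # uncomment in a real job script to stop early
fi
```

A check like this only reports the state at the moment it runs; other jobs on the same machine can still fill /scratch afterwards.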

 

Please refer to the LSF documentation to become familiar with the basic usage of LSF.

...