...

You can minimize the time it takes for a general queue job to start running by defining a wall-clock time limit. Instead of explicitly selecting a general queue (short, medium, long, xlong, xxl), just provide the RUNLIMIT argument to the bsub command. The syntax is '-W [hour:]minute'. This time limit should be based on a worst-case scenario, since LSF will terminate the job if it exceeds the RUNLIMIT value. The limit is measured in real-world (wall-clock) time, since we are not using any kind of normalization. The automatic queue selection feature will place your job in the appropriate general queue, eliminating any guesswork. Some examples:

...

Providing a RUNLIMIT lets the scheduler know the time window your job requires. Without an explicit RUNLIMIT, the scheduler can only assume your job will run as long as the default RUNLIMIT for the queue, and this default is often far greater than many jobs need! For example, the xlong queue currently has a default RUNLIMIT of 72 hours, but queue statistics show that the average runtime for jobs in this queue is currently ~2 hours.
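
To make the '[hour:]minute' syntax concrete, the following sketch converts such a value to total minutes, which can help when comparing your worst-case estimate against a queue default. The function name to_minutes is a hypothetical helper for illustration and is not part of LSF.

```shell
# Hypothetical helper (not part of LSF): convert an '[hour:]minute'
# value, as accepted by 'bsub -W', into total minutes.
to_minutes() {
    value=$1
    case "$value" in
        *:*) hours=${value%%:*}; minutes=${value#*:} ;;  # "H:M" form
        *)   hours=0; minutes=$value ;;                  # minutes only
    esac
    # Strip leading zeros so that "08" is not read as an invalid octal number.
    hours=$(echo "$hours" | sed 's/^0*//')
    minutes=$(echo "$minutes" | sed 's/^0*//')
    echo $(( ${hours:-0} * 60 + ${minutes:-0} ))
}

to_minutes 45      # prints 45
to_minutes 1:30    # prints 90
to_minutes 72:00   # prints 4320
```

For instance, the 72-hour xlong default corresponds to 4320 minutes, while a job that needs about 2 hours could request '-W 2:00' (120 minutes).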

Specifying a runtime estimate

The RUNLIMIT parameter may not provide enough flexibility for certain types of event processing. We may expect the majority of jobs in our pipeline to complete on time, but some jobs get hung up and could take longer. We want to avoid having to resubmit longer-running jobs that get killed off by the LSF scheduler. One solution is to provide a runtime estimate in addition to the RUNLIMIT. The argument syntax for the runtime estimate is '-We [hour:]minute'. Most jobs should complete within the estimated time. The estimate is used for backfill scheduling. The scheduler terminates a job only when it exceeds the RUNLIMIT, not when it exceeds the estimate. Example of a job with a runtime estimate of 15 minutes and a RUNLIMIT of 1 hour:

bsub -We 15 -W 1:00 processEvent
Job <209332> is submitted to default queue <medium>.
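
When submitting many pipeline jobs, it may help to assemble the command in one place. The sketch below only prints a bsub command line and does not submit anything. The function name build_bsub_cmd is hypothetical, both times are given in whole minutes (the '[hour:]minute' form with the hour omitted), and the check that the estimate does not exceed the RUNLIMIT is a sanity check assumed here, not a documented LSF rule.

```shell
# Hypothetical helper (not part of LSF): print, but do not run, a bsub
# command with a runtime estimate (-We) and a RUNLIMIT (-W).
# Usage: build_bsub_cmd ESTIMATE_MIN LIMIT_MIN COMMAND [ARGS...]
build_bsub_cmd() {
    estimate=$1
    limit=$2
    shift 2
    # Assumed sanity check: an estimate above the hard limit makes no sense,
    # because the job would be terminated before reaching it.
    if [ "$estimate" -gt "$limit" ]; then
        echo "estimate ($estimate min) exceeds RUNLIMIT ($limit min)" >&2
        return 1
    fi
    echo "bsub -We $estimate -W $limit $*"
}

build_bsub_cmd 15 60 processEvent   # prints: bsub -We 15 -W 60 processEvent
```

Printing the command first makes it easy to review the values before running it by hand or piping it to a shell.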