(08:19:16) Tom: xlong time limit is 3600 minutes of "SLAC time"
(08:19:36) Tony Johnson: OK, how do you find that out?
(08:20:00) Tom: On a machine registered to run LSF, "bqueues -l xlong"
(08:20:48) Tom: To convert from "SLAC time" to wall-clock time, divide by the CPU Factor
(08:20:59) Tom: You can see a list of these factors using the "lsinfo" command
(08:21:22) Tony Johnson: Thanks
(08:23:08) Tom: The remaining puzzle is figuring out the "MODEL_NAME" <-> farm node name correspondence
(08:23:15) Tom: (if you need that)
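
The conversion Tom describes (divide "SLAC time" by the CPU factor of the host) can be sketched in a few lines of Python. This is only an illustration: the function name and the CPU factor of 2.0 are hypothetical, and real factors would have to be read from the "lsinfo" output for the host model in question.

```python
def slac_to_wall_clock_minutes(slac_minutes: float, cpu_factor: float) -> float:
    """Convert a queue limit in "SLAC time" to wall-clock minutes on one host.

    Follows the rule from the chat: wall-clock time = SLAC time / CPU factor.
    """
    if cpu_factor <= 0:
        raise ValueError("cpu_factor must be positive")
    return slac_minutes / cpu_factor


# xlong limit quoted in the chat, in minutes of "SLAC time".
XLONG_LIMIT_SLAC_MINUTES = 3600

# Hypothetical CPU factor, chosen only for illustration; look up the real
# value for a given host model with "lsinfo".
example_cpu_factor = 2.0

wall = slac_to_wall_clock_minutes(XLONG_LIMIT_SLAC_MINUTES, example_cpu_factor)
print(f"xlong limit on this host: {wall:.0f} wall-clock minutes ({wall / 60:.1f} hours)")
```

With the assumed factor of 2.0, a host would reach the xlong limit after 1800 wall-clock minutes; a host with factor 1.0 would get the full 3600.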
(08:25:41) Tony Johnson: Do you know what the difference is between CPULIMIT and RUNLIMIT? Is the latter total elapsed time?
(08:26:22) Tom: Yes. If a job starts running and then, say, goes to sleep for 8 hours, without taking up any CPU time at all, it will still time out if RUNLIMIT is exceeded.
(08:26:58) Tony Johnson: OK, so I assume they only schedule one job per CPU
(08:27:54) Tom: Yes, and that is a settable parameter. If you use the "bqueues" command (without options), you will see a column called "JL/P", which is the job limit per processor
(08:28:05) Tom: It is "1" for all production queues
(08:28:27) Tony Johnson: OK, thanks
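
The CPULIMIT/RUNLIMIT distinction discussed above can be illustrated with a small sketch. It assumes, for simplicity, that both limits and both measurements are already expressed in the same unit (minutes on the host running the job); the function name and the numeric limits are hypothetical and are not taken from any real queue configuration.

```python
from typing import Optional


def exceeded_limit(cpu_minutes: float, elapsed_minutes: float,
                   cpu_limit: float, run_limit: float) -> Optional[str]:
    """Return the name of the limit a job has exceeded, or None.

    CPULIMIT is compared against CPU time actually consumed;
    RUNLIMIT is compared against total elapsed time since the job started.
    """
    if cpu_minutes > cpu_limit:
        return "CPULIMIT"
    if elapsed_minutes > run_limit:
        return "RUNLIMIT"
    return None


# Tom's example: a job that sleeps for 8 hours and uses no CPU time.
# The limits below are hypothetical values chosen for illustration.
result = exceeded_limit(cpu_minutes=0, elapsed_minutes=8 * 60,
                        cpu_limit=300, run_limit=400)
print(result)
```

The sleeping job never touches its CPU limit, yet it is still flagged because its elapsed time has passed the run limit.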
