...

When the servers are idle, the idle thread count should be 122. The SLAC IT people consider it a warning if the count goes below 110 and an error at 100. I usually start thinking about taking action if it stays below 60 for more than a few minutes (although afs112 has been known to go as low as 40, since PipelineStaging6 is for the HalfPipe). This is likely to occur if there are more than ~300 chunk jobs running, usually after recon finishes and the chunk-level jobs downstream of recon start up. I've written a script that will suspend jobs that are using a specified server, wait a bit, and then resume them with a little delay between each one:

...
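
For reference, the idle-thread thresholds described above can be written down as a small check. This is only a rough sketch in Python, not part of the actual script: the function names are made up for illustration, "an error at 100" is assumed to mean 100 or fewer, and "more than a few minutes" is assumed to mean more than 5 minutes.

```python
# Thresholds taken from the text above.
WARNING_BELOW = 110      # SLAC IT treats fewer than 110 idle threads as a warning
ERROR_AT_OR_BELOW = 100  # assumption: "an error at 100" means 100 or fewer
ACTION_BELOW = 60        # the point where I start thinking about taking action
SUSTAINED_SECONDS = 300  # assumption: "more than a few minutes" taken as 5 minutes


def it_status(idle_threads):
    """Classify an idle-thread count the way SLAC IT does (hypothetical helper)."""
    if idle_threads <= ERROR_AT_OR_BELOW:
        return "error"
    if idle_threads < WARNING_BELOW:
        return "warning"
    return "ok"


def should_take_action(samples, sustained_seconds=SUSTAINED_SECONDS):
    """Return True if the count stayed below ACTION_BELOW for too long.

    samples is a list of (timestamp_in_seconds, idle_threads) pairs,
    oldest first (hypothetical input format).
    """
    low_since = None  # timestamp at which the current low stretch began
    for timestamp, idle_threads in samples:
        if idle_threads < ACTION_BELOW:
            if low_since is None:
                low_since = timestamp
            if timestamp - low_since > sustained_seconds:
                return True
        else:
            # The count recovered, so the low stretch is over.
            low_since = None
    return False
```

A single dip below 60 does not trigger anything here; only an unbroken run of low samples spanning more than `sustained_seconds` does. A server like afs112 that routinely runs low would need its own, lower value in place of `ACTION_BELOW`.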