
Among the jobs that were launched after 2:30 and failed was findChunks, which had already moved some of the evt files. The automatic retry moved the rest, and then some of the digi and fastMon jobs couldn't find their inputs. This was a little trickier than the usual "move them back" scenario, because the file locations in xroot had to be gathered from two log files. I did that and rolled back findChunks, and it all looks OK to me now.

 

Throttling the Pipeline

If you ever need to limit the amount of work being done on the pipeline (as we wanted to for the LAT restart in April 2018), you can manually create throttle locks to limit the number of runs that can be worked on at once. Right now the pipeline is set to allow up to 6 simultaneous runs. To lower that limit, create lock files in the /nfs/farm/g/glast/u41/L1/throttle directory named 0.lock, 1.lock, ... up to 5.lock. The contents can be anything you want; it is just the presence of the file that stops things from running. Each lock file created reduces the number of simultaneous runs by one, so creating all six stops the pipeline from processing anything.
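The lock-file scheme above could be scripted roughly as follows. This is only a sketch: the function name set_throttle and the constant MAX_RUNS are made up for illustration, and it assumes that any of the six lock files counts equally (so it always uses the lowest-numbered ones) and that deleting a lock file lifts that part of the throttle.

```python
from pathlib import Path

# The pipeline currently allows up to 6 simultaneous runs (0.lock .. 5.lock).
MAX_RUNS = 6


def set_throttle(throttle_dir, allowed_runs, max_runs=MAX_RUNS):
    """Leave exactly (max_runs - allowed_runs) lock files in throttle_dir.

    Hypothetical helper, not part of the pipeline. Returns the names of the
    lock files present afterwards.
    """
    if not 0 <= allowed_runs <= max_runs:
        raise ValueError(f"allowed_runs must be between 0 and {max_runs}")

    throttle_dir = Path(throttle_dir)
    n_locks = max_runs - allowed_runs

    for i in range(max_runs):
        lock = throttle_dir / f"{i}.lock"
        if i < n_locks:
            # Only the presence of the file matters, so an empty file is enough.
            lock.touch()
        elif lock.exists():
            # Assumption: removing the file lifts that part of the throttle.
            lock.unlink()

    return sorted(p.name for p in throttle_dir.glob("*.lock"))
```

For example, set_throttle("/nfs/farm/g/glast/u41/L1/throttle", 2) would leave 0.lock through 3.lock in place, allowing two runs at a time, and set_throttle("/nfs/farm/g/glast/u41/L1/throttle", 6) would remove all of the locks again.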

 

Changing the Reaper Settings

Several variables on the JMX page of the pipeline control the operation of the reaper (the process that looks for dead processes, cleans them up, and restarts them).

The first is ReaperFrequencySeconds, in the Control section at the top. This parameter controls how often the reaper actually runs. It is currently set to 60 (once per minute).

The other parameter, ReaperDelayMinutes, is set on a per-submission-source basis and is found in each of the later sections on that page. It controls how long a process has to be dead before the reaper kills it. It is typically set to 60 or 120 minutes. NOTE (from Warren): I'm not convinced that ReaperDelayMinutes actually does anything. Restarting the pipeline set both parameters back to their defaults.