...
Only one delivery can process a run at a time. This is enforced by a lock file in the run directory on u52/L1. If there are permanent failures in the run and another part of the run is waiting, the lock file has to be removed by hand. The lock file should never be removed unless the only failures in the run are permanent ones, or there's a deadlock. Even then you have to wear a helmet and sign a waiver.
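For readers unfamiliar with lock files, here is a minimal sketch of how this kind of one-at-a-time rule is commonly enforced. It is not the pipeline's actual code: the file name `run.lock` and the functions `acquire_run_lock` and `release_run_lock` are made up for illustration, and the real lock may be created differently. It also assumes the filesystem honors exclusive create, which is not guaranteed on every network filesystem.

```python
import os

# Hypothetical name; the real lock file name is not given on this page.
LOCK_NAME = "run.lock"


def acquire_run_lock(run_dir, owner):
    """Try to take the lock for run_dir.

    Returns True if this caller now holds the lock, False if the
    lock file already exists (someone else is processing the run).
    """
    lock_path = os.path.join(run_dir, LOCK_NAME)
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so at most
        # one caller can create it successfully.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    # Record who holds the lock, which helps anyone cleaning up by hand.
    with os.fdopen(fd, "w") as handle:
        handle.write(owner + "\n")
    return True


def release_run_lock(run_dir):
    """Remove the lock file so another delivery can process the run."""
    os.remove(os.path.join(run_dir, LOCK_NAME))
```

In this sketch a delivery that dies without calling `release_run_lock` leaves the file behind, which is exactly the situation where someone ends up removing a lock by hand.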
When the AFS servers where we keep temporary files hiccup, it's usually because they ran low on idle threads. It is possible to monitor the idle thread count and intervene to stave off disaster. Unfortunately, the count is only available from Nagios, which only works inside SLAC's firewall.
...