Task of the Data Processing on-call expert

Monitor L1Proc and halfPipe: Every time a red cross appears on the Data Processing page next to the L1Proc or halfPipe processing status bar, the Level 1 on-call expert is paged. We are not on-call for ASP/GRB search failures; for these, email (do not page) Jim Chiang (jchiang{at}slac). We are definitely NOT on-call for infrastructure problems (can't see monitoring plots, etc.). If you get paged for something that is not your responsibility, don't try to fix it: forward the message to the appropriate people and report everything in the Ops Log.

Watch the Usage Plots and look for L1Proc/HalfPipe related tasks (doChunk, doCrumb, etc.). As a rule of thumb, set the time range to the last 2 hours; a longer range loses too much detail in the plot. If you see a series of points forming a flat line for an extended period of time, it may indicate problems with the pipeline.
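
The flat-line check can also be expressed as a small script. The following is only a sketch, not part of the monitoring tools: it assumes the plotted points are available as (timestamp in seconds, value) pairs, and the function name find_flat_runs and the 30-minute threshold are hypothetical choices, since this page does not define how long an "extended period" is.

```python
from typing import List, Sequence, Tuple


def find_flat_runs(
    samples: Sequence[Tuple[float, float]],
    min_duration_s: float = 1800.0,  # assumed threshold (30 min), not an official value
    tolerance: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps of stretches where the value stayed flat.

    `samples` must be sorted by timestamp. A stretch counts as flat while every
    value stays within `tolerance` of the first value of that stretch.
    """
    runs: List[Tuple[float, float]] = []
    if not samples:
        return runs

    start_t, ref_value = samples[0]
    prev_t = start_t
    for t, value in samples[1:]:
        if abs(value - ref_value) > tolerance:
            # The value moved: close the current stretch if it lasted long enough.
            if prev_t - start_t >= min_duration_s:
                runs.append((start_t, prev_t))
            start_t, ref_value = t, value
        prev_t = t

    # Close the final stretch, which may run to the end of the window.
    if prev_t - start_t >= min_duration_s:
        runs.append((start_t, prev_t))
    return runs


if __name__ == "__main__":
    # Two hours of made-up points, one every 5 minutes, with a stall in the middle.
    window = [(i * 300, i if i < 5 else 5 if i <= 15 else i - 10) for i in range(25)]
    for start, end in find_flat_runs(window):
        print(f"flat from t={start}s to t={end}s")
```

A flagged stretch is only a hint to look more closely at the pipeline; a flat line can also mean that no data is arriving.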

Different types of failures

...