Task of the Data Processing on-call expert

...

  • Send an email to unix-admin (this usually works, even at night and on weekends).
  • If you don't get an answer and the issue is urgent, call xHELP (650-926-4357). Choose 3 to page the on-call person.
  • If your call or email isn't answered and the issue is REALLY urgent, page the on-call person again at 650-926-2230.

Log Watcher Messages

 


Message text: Can't open lockfile /nfs/farm/g/glast/u41/L1/r0248039911/r0248039911.lock.

  • The monitoring shifter doesn't need to do anything about it. The L1 shifter should figure out why it happened. In this case, it happened because I rolled back a merge job in a run that was already done. (Warren)


cancelProcessInstance

Here's the syntax to cancel a process that is not in a final state, along with all its dependencies. This might be useful when you don't want to wait for it to finish before rolling something back. However, it's usually faster to wait, as the cancel can take a long time:
/afs/slac.stanford.edu/u/gl/glast/pipeline-II/dev/pipeline --mode PROD cancelProcessInstance 8073657
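
If you cancel instances often, a small wrapper can guard against typos in the ID. The following is only a sketch, not part of the pipeline tools: the function name cancel_instance and the DRY_RUN variable are made up for illustration, and it assumes the process instance ID is always a plain number, as in the example above.

```bash
#!/bin/bash
# Hypothetical helper (not part of pipeline-II): checks the ID, then runs
# the same cancelProcessInstance command shown above.
PIPELINE=/afs/slac.stanford.edu/u/gl/glast/pipeline-II/dev/pipeline

cancel_instance() {
    id=$1
    # Assumption: process instance IDs are purely numeric; reject anything else.
    case $id in
        ''|*[!0-9]*)
            echo "usage: cancel_instance <numeric process instance ID>" >&2
            return 1
            ;;
    esac
    # DRY_RUN=1 (a made-up convention) prints the command instead of running it.
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$PIPELINE --mode PROD cancelProcessInstance $id"
    else
        "$PIPELINE" --mode PROD cancelProcessInstance "$id"
    fi
}
```

Setting DRY_RUN=1 before calling cancel_instance 8073657 prints the command so you can check it before running it for real.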

...

Here's a little script that is useful for monitoring: 


#!/bin/bash

lines=$1

# Columns that LSF's bjobs prints (LSB_BJOBS_FORMAT sets its output format)
export LSB_BJOBS_FORMAT="id name:15 user submit_time stat:5 exec_host:10 start_time mem cpu_used"

...

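# Job slot usage for the glast accounts (wide format)
busers -w glast glastmc glastraw

# Status of the glastdataq queue (wide format)
bqueues -w glastdataq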

 

 



Shift sign up

Sign up for shifts here. View the shift calendar here.

...

...