1) Who is responsible for batch job failures?
2) It would be helpful to have steps for diagnosing a given problem. For instance, if the pipeline is running slowly, how do you decide whether to reboot the Tomcat server it runs on?
3) Oracle email notification, for example: "GLAST-ORACLE **** Data Guard problem on glast-oracle03"
How do you read the notification, and what, if anything, should be done?

Just a comment: it is very difficult to know who is responsible for what when problems occur. Is there a general list of GLAST processes, the group that controls each process, and the people in each group?
