Frequently asked questions about infrastructure shifts

These are some questions that came up while perusing the How-To-Fix pages in preparation for taking shifts. Feel free to answer the questions or add new ones.

  • When Nagios reports disks becoming full, how is more disk space allocated, and by whom?
    • This depends on the type of disk:
      • Local disk on glastlnx* - contact Navid to clean up
      • NFS disk - see this list of which disk is used for what
      • AFS disk - if the error message is 'disk not responding', send email to unix-admin@slac.stanford.edu
      • xrootd disks - contact Wilko, or Richard for more $$$
  • Nagios and Ganglia show a lot of status information. How are newbie shift takers supposed to use that information?
    • The only page you would be likely to look at regularly is the page which lists current problems. The other pages are used only for troubleshooting.
    • There is a table with the heading 'Web Applications'. If you click that heading, you get a list of applications under the 'Service' column, along with server and status information.
  • Is there a checklist that Infrastructure shift takers need to go over at the beginning/end of each shift?
    • Not currently
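
The answer to the disk-space question above is effectively a lookup from disk type to the person or list to contact. A shift taker who wants a quick reference could sketch it roughly as follows. This is only an illustration: the table name, the keys, and the function `disk_full_action` are hypothetical and not part of any existing shift tool; the contact text is copied from the answers above.

```python
# Hypothetical quick-reference table; keys and names are illustrative only.
# The contact text mirrors the FAQ answers above.
DISK_CONTACTS = {
    "local": "Local disk on glastlnx*: contact Navid to clean up",
    "nfs": "NFS disk: see the list of which disk is used for what",
    "afs": "AFS disk: if the error is 'disk not responding', "
           "email unix-admin@slac.stanford.edu",
    "xrootd": "xrootd disk: contact Wilko, or Richard for more $$$",
}


def disk_full_action(disk_type: str) -> str:
    """Return the FAQ's suggested action for a disk-full alert."""
    key = disk_type.strip().lower()
    # Fall back to an explicit message for disk types the FAQ does not cover.
    return DISK_CONTACTS.get(key, "Unknown disk type: no contact listed in this FAQ")


if __name__ == "__main__":
    print(disk_full_action("xrootd"))
```

Keeping the mapping in one table means a new disk type or a change of contact only needs a one-line edit.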