...
We need to have a meeting, perhaps on Monday, to review the status and lessons learned, and to discuss how to get (back) to a "production" level. Some topics:
...
- Are there problems with the way we have our VMs installed/configured?
- Do we have things distributed among VMs in an optimal way? (I think we have too much on scalnx-v01, for example.)
- Is our documentation on what is running where complete and correct?
- Do we have documentation that would allow people at BNL to diagnose and fix some problems?
- Do we have Nagios configured optimally?
- Do we need server monitoring set up for non-Fermi machines?
- Do we need to put things like login, group manager, etc. under CCB?
- Database monitoring
...
- Update Server Locations and Functions page Brian Van Klaveren
- Check that Nagios is monitoring all servers and web applications Charlotte Hee
- Set up server monitoring page for non-Fermi servers Massimiliano Turri (http://srs.slac.stanford.edu/ServerMonitoring/exp/LSST-CAMERA/servers.jsp; link added under the Developer menu tab)
- Add name of VM on which tomcat server is running Massimiliano Turri
- Move web applications back to HA machine lsstlnx-v01 Tony Johnson
- Check that Ganglia is monitoring virtual and physical servers Charlotte Hee
- If they have been monitored, check the load plots for the physical machines Charlotte Hee
- Make sure scalnx-v01 has production-only processes Tony Johnson
- Follow up with Arash on database monitoring, backups, logs, configurations Joanne Bogart
- Follow up with Yemi on VM problems Tony Johnson
- Improve and coordinate documentation Kelly (Arrighi), Heather
- Ask Yemi about status of GroundWorks or similar infrastructure Charlotte Hee
- Storage on HA file server Tom Glanzman
- IIS problem: is more investigation required?
- Get scalnx10-vmm added to sca Ganglia. Charlotte Hee