Purpose of this page

  • we are collecting information about upcoming Fermi computing outages (disks, Oracle, network) to improve planning
  • when planning an outage, please send an email to datalist and write the description here (including requested duration and preferred timeframe)
  • we will try to combine outages as much as possible, in order to maximize uptime for time-critical services (FASTCopy, pipeline, etc.)
  • once the plan is finalized, don't forget to send a message to glast-outage and the collaboration (if applicable)

Upcoming outage requests

Aug 16, 2012 - site-wide power outage.

From John: everything except the servers on the generator will go down. Building 50 is supposed to be the first (or one of the first) buildings brought back up. Power goes off at 5:30 am on 8/16 and could be restored by 6:30 am. Bring-up would begin after that, with most services back in 2-4 hours. NOTE, however, that we tentatively plan to start taking machines down at 17:30 the night before (Aug 15). So we are looking at a ~16 hour outage, if things go well.
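
The ~16 hour figure follows from the times quoted above. A minimal Python sketch of the arithmetic, assuming the tentative 17:30 shutdown and the 6:30 am power restoration both hold (the variable names are illustrative only):

```python
from datetime import datetime, timedelta

# Times taken from the note above; all of them are tentative.
shutdown_start = datetime(2012, 8, 15, 17, 30)  # machines start going down
power_restored = datetime(2012, 8, 16, 6, 30)   # earliest expected power restoration

# Bring-up is expected to take 2-4 hours once power is back.
bringup_min = timedelta(hours=2)
bringup_max = timedelta(hours=4)


def hours(delta):
    """Convert a timedelta to a number of hours."""
    return delta.total_seconds() / 3600


best_case = hours(power_restored + bringup_min - shutdown_start)
worst_case = hours(power_restored + bringup_max - shutdown_start)
print(f"Expected outage: {best_case:.0f}-{worst_case:.0f} hours")
```

The range brackets the ~16 hour estimate; any delay in restoring power pushes both ends out by the same amount.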

July 11, 2012

  • 11:00am - 1:00pm: Replacing a bad fan on sulky34. Since that server holds the LAT raw data, FASTCopy ingestion will be stopped about an hour beforehand to let the pipeline drain.
    Also, the remaining databases will be migrated off of glastlnx01/02 onto mysql-node01.

July 9, 2012

  • 9:30 - 11:00am: An internal disk on glast-oracle03, the host of the GLASTP database, is in danger of failing. This outage is to allow for its replacement.

June 12, 2012

  • 10am - 11:30am: migrating calib* and mood* databases from glastlnx01/02 to mysql-node03

May 10, 2012

  • [10am-12:30pm] Oracle quarterly update. This will affect pipeline, data catalog, flight operations and any other databases on the main Fermi Oracle server.

  • [10am-12:30pm] xroot server reboot for OS upgrade. This will affect all 36 of the wain (Solaris) xroot servers.

  • [10am-12:30pm] Fermi USER DISK (wain006) reboot for OS upgrade.

  • [9am-3pm] xroot file server move. This will affect only two xroot servers: wain070 and wain071.

  • [9am-3pm] NFS file server move. This will affect the following servers, which will be unplugged and physically moved to new rack space
    in Building 50: sulky33, sulky34, sulky35, sulky36
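
Because the May 10 requests overlap, the overall window during which at least one service is down can be found by merging the intervals. A small Python sketch, assuming the bracketed times above are the complete windows (the function name `merge_windows` is hypothetical, not an existing tool):

```python
from datetime import datetime


def merge_windows(windows):
    """Merge overlapping (start, end) windows into sorted, disjoint windows."""
    merged = []
    for start, end in sorted(windows):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous window: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


day = (2012, 5, 10)
requests = [
    (datetime(*day, 10, 0), datetime(*day, 12, 30)),  # Oracle quarterly update
    (datetime(*day, 10, 0), datetime(*day, 12, 30)),  # xroot server reboot
    (datetime(*day, 10, 0), datetime(*day, 12, 30)),  # wain006 reboot
    (datetime(*day, 9, 0), datetime(*day, 15, 0)),    # xroot file server move
    (datetime(*day, 9, 0), datetime(*day, 15, 0)),    # NFS file server move
]

for start, end in merge_windows(requests):
    print(f"{start:%H:%M} - {end:%H:%M}")
```

The merged window only says when something is down; which services are affected still depends on the individual entries above.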
