
The morning of the first Wednesday of the month is designated as a maintenance reboot window for some Linux servers. Linux servers can be automatically rebooted by taylor or chef, or they can be manually rebooted by a unix-admin team member.

This monthly reboot schedule facilitates the patching and activation of updated kernels, glibc, and other shared libraries on Linux servers. Periodic reboots to enable security patches are required by SLAC Cyber Security and the Department of Energy.

On the first Wednesday of the month, Linux server reboots are staggered between 1:30 AM and 7 AM Pacific Time for automatic reboots by taylor and chef, and between 7 AM and 11:30 AM Pacific Time for manual reboots by a unix-admin team member.
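The schedule above can be checked mechanically. The following is a minimal sketch, not part of taylor or chef: the function names `first_wednesday` and `reboot_window` are hypothetical, the input is assumed to be Pacific wall-clock time, and the handling of the boundaries (7:00 AM counted as manual, 11:30 AM counted as outside the window) is an assumption, since the text does not say which window owns the endpoints.

```python
from datetime import date, datetime, time, timedelta
from typing import Optional

# Window boundaries taken from the text above (Pacific wall-clock time).
AUTO_START = time(1, 30)
AUTO_END = time(7, 0)      # also the start of the manual window
MANUAL_END = time(11, 30)


def first_wednesday(year: int, month: int) -> date:
    """Return the date of the first Wednesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, so Wednesday is 2.
    offset = (2 - first.weekday()) % 7
    return first + timedelta(days=offset)


def reboot_window(local: datetime) -> Optional[str]:
    """Classify a Pacific wall-clock time as 'automatic', 'manual', or None.

    Assumption: each window includes its start time and excludes its end time.
    """
    if local.date() != first_wednesday(local.year, local.month):
        return None
    t = local.time()
    if AUTO_START <= t < AUTO_END:
        return "automatic"
    if AUTO_END <= t < MANUAL_END:
        return "manual"
    return None
```

For example, `reboot_window(datetime(2024, 6, 5, 2, 0))` falls in the automatic window, because 5 June 2024 is the first Wednesday of that month.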

If you would like to add any servers to the first Wednesday of the month reboot list, please email unix-admin@slac.stanford.edu and indicate whether you would like your server to be automatically rebooted by taylor or chef, or manually rebooted by a unix-admin team member.

See also:

Interactive Login Pools Monthly Reboots: https://confluence.slac.stanford.edu/display/SCSPub/Interactive+Login+Pools+Monthly+Reboots

There is currently no documented and announced standard maintenance window for patching and rebooting centrally managed unix hosts.

The classes of machines described below need to be clearly identified within the configuration management tools (taylor, chef).

Trivial patching (e.g., RPM updates of userland software) can be done outside these outage windows.

The outage windows are needed to activate new kernels, glibc, X11, and related software that requires a reboot to take effect.

The scheduled outage window does not have to be used every time; it is an announced outage that is available when needed.

Linux Desktops

Taylored RHEL 5, RHEL 6
Cheffed CentOS 7, Ubuntu
Clarify the definition of a desktop (does not include kiosk machines)
1. Single-user personal productivity workstation
Follow similar schedule as Windows desktops
Communication via unixusers-l@slac.stanford.edu mailing list in addition to comp-out
1. depending on the type of outage and who it affects

Linux Storage Servers

Schedule determined by SCS Storage Team
Recommended: one outage window per quarter for patching and rebooting

Linux Infrastructure Servers (non-storage)

Examples include:
1. samba
2. ftp
3. web
4. chef
5. mail servers - quarterly
  1. Uy Chu said: you can group them as mailgate10/15 and mailgate11/16 and reboot those two pairings one at a time
    1. they should be redundant, so mail should not be affected if each pair is working properly
Follow similar schedule as Windows servers
Communication via unixusers-l@slac.stanford.edu mailing list

ERP Linux servers

Schedule determined by ERP computing owners (e.g., Monica, Ram)
Recommended: one outage window per quarter for patching and rebooting

Linux Interactive Servers

Examples include:
1. rhel6-64 login pool
2. centos7 login pool
3. FastX login pool

Batch Compute Farms

Schedule determined by SCS HPC Team
Recommended: one outage window per quarter for patching and rebooting
Could be managed via the compute scheduler so the outage is non-disruptive and transparent
1. However, it might be simpler for SCS staff to have a maintenance window in which to perform mass reboots

Non-OCIO Linux Servers (non-storage)

Examples include:
1. Fermi
2. BaBar
3. SUNCAT
4. EED
5. etc.
A science computing coordinator/contact person needs to be identified
Schedule needs to be negotiated between OCIO and the computing contact
Recommended: one outage window per quarter for patching and rebooting
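The machine classes listed above could be recorded as data so that each class has an explicit schedule owner and cadence. The following is only an illustrative sketch: the keys, the `MachineClass` type, and its field names are hypothetical and are not taylor or chef attributes. Values are copied from the lists above, and `None` marks anything this page does not state.

```python
from typing import Dict, List, NamedTuple, Optional


class MachineClass(NamedTuple):
    schedule_owner: Optional[str]  # who sets the schedule, as stated above
    cadence: Optional[str]         # recommended cadence, if one is stated


MACHINE_CLASSES: Dict[str, MachineClass] = {
    # Desktops and infrastructure servers follow the Windows schedules.
    "desktop": MachineClass("similar to Windows desktops", None),
    "infrastructure": MachineClass("similar to Windows servers", None),
    "storage": MachineClass("SCS Storage Team", "quarterly"),
    "erp": MachineClass("ERP computing owners", "quarterly"),
    # No owner or cadence is given for interactive servers in this list.
    "interactive": MachineClass(None, None),
    "batch-compute": MachineClass("SCS HPC Team", "quarterly"),
    "non-ocio": MachineClass("negotiated by OCIO and the computing contact",
                             "quarterly"),
}


def quarterly_classes() -> List[str]:
    """Return the class names whose recommended cadence is quarterly."""
    return sorted(name for name, mc in MACHINE_CLASSES.items()
                  if mc.cadence == "quarterly")
```

Keeping the unstated entries as `None` rather than guessing makes the open questions in the lists above easy to spot.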
