The first Wednesday of the month is designated as a maintenance reboot window for some Linux servers.  Linux servers can be automatically rebooted by taylor or chef, or they can be manually rebooted by a unix-admin team member.

This monthly reboot schedule facilitates the patching and activation of updated kernels, glibc, and other shared libraries on Linux servers.  Periodic reboots to enable security patches are required by SLAC cyber security and DOE. On the first Wednesday of the month, Linux server reboots are staggered between 2 AM and 7 AM Pacific Time for automatic reboots by taylor and chef, and between 7 AM and 11:30 AM for manual reboots by a unix-admin team member.
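
The window rules above can be sketched in a few lines of Python. This is only an illustration, not something taylor or chef is known to run: the function names are hypothetical, times are treated as naive Pacific local times, and since both windows touch at 7 AM the sketch assumes that exactly 7:00 counts as the start of the manual window.

```python
import calendar
from datetime import date, datetime, time
from typing import Optional

# Window boundaries taken from the text (Pacific Time, naive datetimes assumed).
AUTO_START, AUTO_END = time(2, 0), time(7, 0)
MANUAL_START, MANUAL_END = time(7, 0), time(11, 30)


def first_wednesday(year: int, month: int) -> date:
    """Return the date of the first Wednesday of the given month."""
    first = date(year, month, 1)
    # Days to move forward from the 1st to reach a Wednesday (0 if the 1st is one).
    offset = (calendar.WEDNESDAY - first.weekday()) % 7
    return date(year, month, 1 + offset)


def reboot_window(moment: datetime) -> Optional[str]:
    """Classify a local datetime as 'automatic', 'manual', or None (outside both)."""
    if moment.date() != first_wednesday(moment.year, moment.month):
        return None
    t = moment.time()
    if AUTO_START <= t < AUTO_END:
        return "automatic"
    # Assumption: 7:00 sharp belongs to the manual window.
    if MANUAL_START <= t <= MANUAL_END:
        return "manual"
    return None
```

For example, `first_wednesday(2020, 3)` gives March 4th, 2020, and `reboot_window(datetime(2020, 3, 4, 3, 0))` falls in the automatic window.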

Date (and comp-out link): Wednesday, March 4th, 2020
Time: 1 AM - 11:30 AM
Server hostnames:

Chef nodes:

chef-automate2
chef-build01
cups02
ftp1
ganglia04
ksa-c7a
mgmt-centos7
nagios04
novel02
nx3
samba03

Taylor nodes:

cdlogin1
cdlogin2
cdlogin3
mgmt-authproxy01
mgmt-rhel01
mgmt-rhel02
ns-test
ssrl-vip1
ssrl-vip2
tftp
tftp-rhel6
version01
vip1

RHEL 6 kernel release: 2.6.32-754.27.1
CentOS 7 kernel release: 3.10.0-1062.12.1
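
The page says automatic reboots are staggered across the 2 AM - 7 AM window but does not say how. One possible approach, shown here purely as an assumption, is to derive a stable per-host offset from a hash of the hostname so that each server reboots at the same minute every month. The function names and the choice of SHA-256 are illustrative and are not taken from taylor or chef.

```python
import hashlib
from datetime import datetime, timedelta

WINDOW_START_HOUR = 2      # 2 AM, from the text
WINDOW_MINUTES = 5 * 60    # 2 AM - 7 AM is a five-hour window


def reboot_offset_minutes(hostname: str, window_minutes: int = WINDOW_MINUTES) -> int:
    """Map a hostname to a stable offset (in minutes) inside the window."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # The first four bytes of the digest are enough to spread hosts evenly.
    return int.from_bytes(digest[:4], "big") % window_minutes


def reboot_time(hostname: str, day: datetime) -> datetime:
    """Return the staggered reboot time for a host on the given day."""
    start = day.replace(hour=WINDOW_START_HOUR, minute=0, second=0, microsecond=0)
    return start + timedelta(minutes=reboot_offset_minutes(hostname))
```

Because the offset depends only on the hostname, a call such as `reboot_time("vip1", datetime(2020, 3, 4))` returns the same time on every run, which keeps announcements predictable. Hosts that must not reboot together (for example redundant pairs) would still need explicit handling.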
   

There is currently no documented and announced standard maintenance window for patching and rebooting centrally managed Unix hosts.

The classes of machines described below need to be clearly identified within the configuration management tools (taylor, chef).

Trivial patching (e.g., RPM updates of userland software) can be done outside these outage windows.

The outage windows are needed to activate new kernels, glibc, X11, and related software that requires a reboot to take effect.

Using the scheduled outage window is not required; it is an announced outage that is available if necessary.

Linux Desktops

  1. Taylored RHEL 5, RHEL 6
  2. Cheffed CentOS 7, Ubuntu
  3. Clarify definition of a desktop (does not include kiosk machines)
    1. Single user personal productivity workstation
  4. Follow similar schedule as Windows desktops
  5. Communication via unixusers-l@slac.stanford.edu mailing list in addition to comp-out
    1. depending on the type of outage and who it affects

Linux Storage Servers

  1. Schedule determined by SCS Storage Team
  2. Recommended: one outage window per quarter for patching and rebooting

Linux Infrastructure Servers (non-storage)

  1. Examples include:
    1. samba
    2. ftp
    3. web
    4. chef
    5. mail servers - quarterly
      1. Uy Chu said: you can group them into pairs, mailgate10/15 and mailgate11/16, and reboot those 2 pairings one at a time
        1. they should be redundant, so mail should not be affected if each pair is working properly
  2. Follow similar schedule as Windows servers
  3. Communication via unixusers-l@slac.stanford.edu mailing list

ERP Linux servers

  1. Schedule determined by ERP computing owners (e.g., Monica, Ram)
  2. Recommended: one outage window per quarter for patching and rebooting

Linux Interactive Servers

Reboot monthly via a root cron job installed by taylor or chef (this is already being done for iris and flora, although those two are on weekly reboot schedules).

  1. Examples include:
    1. rhel6-64 login pool
    2. centos7 login pool
    3. FastX login pool (this needs to be announced)
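
A root cron job for a "first Wednesday" reboot needs a small guard, because in standard cron the day-of-month and day-of-week fields are ORed when both are restricted. A common workaround is to schedule the job on every Wednesday and let the script exit unless the day of the month is 7 or less. The script below is a minimal sketch of that guard under assumptions: the script itself, the `--for-real` flag, the five-minute delay, and the broadcast message are all hypothetical, and it is not the job that taylor or chef actually installs.

```python
import subprocess
import sys
from datetime import date
from typing import List, Optional

WEDNESDAY = 2  # date.weekday() counts Monday as 0


def is_first_wednesday(today: date) -> bool:
    """True only on the first Wednesday of the month."""
    return today.weekday() == WEDNESDAY and today.day <= 7


def reboot_command(today: date) -> Optional[List[str]]:
    """Return the command to run, or None when today is not the reboot day."""
    if not is_first_wednesday(today):
        return None
    # Illustrative values: reboot in five minutes with a broadcast message.
    return ["/sbin/shutdown", "-r", "+5", "Monthly maintenance reboot"]


def main(dry_run: bool = True) -> int:
    cmd = reboot_command(date.today())
    if cmd is None:
        return 0
    if dry_run:
        # Default behaviour: only report what would happen.
        print("would run:", " ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical flag: the real reboot happens only when --for-real is passed.
    sys.exit(main(dry_run="--for-real" not in sys.argv[1:]))
```

Keeping the date check in a pure function makes it easy to verify against past dates before the job is rolled out, and the dry-run default avoids an accidental reboot while testing.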

Batch Compute Farms

  1. Schedule determined by SCS HPC Team
  2. Recommended: one outage window per quarter for patching and rebooting
  3. Could be managed via compute scheduler so outage is non-disruptive and transparent
    1. However, might be simpler for SCS staff to have a maintenance window to perform mass reboots

Non-OCIO Linux Servers (non-storage)

...

  1. Fermi
  2. BaBar
  3. SUNCAT
  4. EED
  5. etc.
