Outages

All times are PT. Red entries are active. Most recent entry first.

Nodes	Services	Start Time	Expected End Time	Actual End Time	Reason	Comments
============ Ongoing ============
psanaoss231	ana13	Aug 5th, 5am			oss problem after power outage
psanaoss215	ana11	Dec 2			oss psanaoss215 of ana11 is down
drp-srcf-xxx	Fast feedback	Aug 15th, 5:00PM	Aug 18th, 3:00PM		Reorganizing DRP racks and connectivity, upgrading Weka FFB Cluster	No fast feedback analysis during this time.
Diskless nodes	IOC and DAQ nodes	Aug 16, 9:00AM	Aug 17, 1:00PM		Diskless server will be migrated to the weka cluster	All diskless clients will be rebooted to use the new network interface.
psana	Interactive pool, batch nodes	Aug 16, 9:00AM	Aug 18, 5:00PM		Upgrading file systems, batch and interactive nodes	No data analysis capabilities during this time.
psexport	All data mover services	Aug 16, 1:00PM	Aug 18, 5:00PM		Waiting for psana and DRP	No ability to move science data during this time.
============ Upcoming ============
MEC and CXI nodes	Network	Aug 18, 2:00PM	Aug 18, 5:00PM		Network upgrade	No control room workstations or IOCs working in MEC/CXI during this time.
All NEH and FEH nodes	IOCs and DAQ nodes, alcove DRP, and control room workstations	Aug 19, 6:00AM	Sep 18, 5:00PM		Electrical work in experimental halls	Systems in the experimental halls will be on-line whenever power is available. Note: psana is not in the experimental halls and won't be affected.
drp-neh-xxx	NEH DRP	Aug 19, 2:00PM	Aug 19, 5:00PM		Upgrading to the latest Lustre version	No DRP in the NEH alcove
============ Completed ============

pswww	Web Services	Aug 16, 6:00PM	Aug 17, 12:00PM	Aug 17, 4:15PM	Service failed after upgrade.	Permission issues from the Weka upgrade
psnx, pslogin, psdev	All login services	Aug 16, 1:00PM	Aug 16, 5:00PM	Aug 16, 6:30PM	Upgrading host to latest packages and services	No ability to ssh into the system during this time.
psweka	NFS	Aug 16th, 9:00AM	Aug 16th, 1:00PM	Aug 16th, 5:00PM	Upgrading and re-configuring Weka Cluster	User home directories won't be available during this time, so nothing will work. This will require rebooting all LCLS servers to use the new NFS stack.
psanagpu, lustre	interactive, jupyter, anafs	Aug 5th, 5am		Aug 5th, 9:20am	power issue SRCF	Most systems have been restored. Only ana13 needs some attention.
Weka Cluster	NFS (home directories, central storage, all related systems)	5/22/2021	Unknown	5/24/2021 ~10pm	Under investigation	The LCLS IT team is working with the vendor to diagnose root cause and future actions. The software version was rolled back, and the system is performing more stably.
psdb0x, psdm0x, pswww0x	logbooks, movers, questionnaires, most of the data management infrastructure	May 18, 9:00AM	May 18, 12:00PM	May 18, 9:55AM	Upgrading all the NFS mounts to the new version of NFS.
psweka	All	April 28th, 8:00PM	April 28th, 9:00PM	April 28th, 9:45PM	Deploying NFS-Ganesha
psanaoss121	ana02	Apr 23, 18:15		Apr 26, 11:50	oss crashed
psweka	All	April 26th 8am	April 26th 12pm	April 26th, 11am	Weka upgrade	Nothing will work during this time
psdb0x, psdm0x	logbooks, movers, questionnaires, most of the data management infrastructure	Apr 21, 3:00PM	Apr 21, 4:00PM	Apr 21, 5:00 PM	Moving all machines to 10Gbps networks.	psdm02 had a bad NIC port, we had to reconfigure this to use the alternate.
psanaoss12n	ana02	Apr 15, 4:30pm	Apr 15, 7pm	Apr 15, 5:40pm	Replacing broken fans
psexport	globus, data transfers	March 24, 9:00am	March 24, 10:00am		reboot psexports to remove ana11/12
psanaoss121	ana02	Apr 6th, 17:00	Apr 7th, afternoon	Apr 7th, 11am	disk related hardware issues
HPSS	tape archive, restore	March 23, 6:00am	March 23, 6pm	March 23, 14:20	HPSS upgrade
drp-srcf	FFB for TMO+XPP/XCS	March 11th, 10am	March 11th, 7pm	March 12th, 1am	Disable SMT on DRP SRCF nodes
psdb(psdb4)	Questionnaire/File Restore/File Manager/LCLS 1 DAQ data mover	Mar 3 6:00PM	Mar 3 10:00PM		Moving databases to new cluster	The questionnaire, the file manager services and the LCLS1 DAQ will be unavailable as we migrate to newer machines.
All	ldap/dhcp/dns	Feb 24th at 10am	Feb 24th at 1pm	Feb 24th at 3pm	psrelay migration	Netconfig and reboots will fail during this time and name resolution lookups may be delayed for a few seconds.
psdb(psdb4)	Questionnaire/File Restore/File Manager/LCLS 1 DAQ data mover	Feb 11 9:00AM	Feb 12 2:30AM	Feb 12, 8:00PM	Moving databases to new cluster	The questionnaire, the file manager services and the LCLS1 DAQ will be unavailable as we migrate to newer machines. Reverted back to psdb4 owing to routing issues.
Instrument Network	all NEH/FEH computing	2/4/2021 10:30am		2/4/2021 11:20am	Power supply failure	Replaced. Redundant supply added.
pswww/pswebkdc	elog/file restore	Jan 20, 2021 ~4:00PM		Jan 21, 2021 ~1 am	Issues with VMWare hypervisor
psdm0x psdb0x	eLog/File restore etc	Dec 21	Dec 24	Dec 22	Upgrade of backends to new releases of software
ana03	ana03	Dec 17, 2:49AM		Dec 17, 10:15am	Lustre issue	psossana0303 was stuck and needed a reboot (hard reboot)
Ana file systems	ana13	Nov 18	Nov 20	Nov 23, 11am	one ost is not accessible for write	ost0xe cannot be written to due to mds/ost issues. It has been set to ro but writes to ana13 are slowed down. Reboot required. Fixed by rebooting MDS (OSSs were also rebooted).
pshub01	JupyterHub	Oct 29 1:00PM		Oct 29 5:44PM	Disk failure.	There is a disk failure in the filesystem that stores the JupyterHub sessions. We are trying to recover this and restart the node. We could not recover the sessions; please log out and restart your session.
<<All>>	Network File System (NFS)	10/19/20, 4pm	10/22/20, 9am	21 Oct 2020	Upgrade to new Solid State Drive-based, distributed clusters	Remaining /reg/[d,g,neh]/* NFS volumes (see Detailed list)
ANA file systems	ana02/03/11/13/15	10/13/20, 6.30am		10/15/20, 17:40	(Unscheduled/failure)	21:20 UED has been moved to weka-nfs: /cds/data/ued/ana, The ana-filesystems are accessible now. The FFB->anafs migration has started but it will take time to clear the backlog
<<All>>	CDS Router	10/5/20, 9am	10/5/20, 12pm	10/5/20, 1pm	Upgrade to new routers, 100Gb optics	During 4-day PAMM. Actual disruption should be order of minutes.
<<All>>	Network File System (NFS)	10/5/20, 12pm	10/8/20, 12pm	10/8/20, 12pm	Upgrade to new Solid State Drive-based, distributed clusters	4-day PAMM. Replace aging legacy systems. Service disruptions may endure for several days as hard-links in applications are identified and repaired in real-time. Only /reg/neh/opr/ and /reg/g/pcds/ mounts were migrated. Future outage(s) will address remaining volumes.
ANA	ana04	April 16th	May 22nd		ana04 is down	Hardware problem with one ana04 ost
ANA file system	ana14	Dec, 2019	May 27th		ana14 is down	Hardware problem with one ana14 oss
nfs server	home directories	Sep 29th		Sep 29th, 11:30	nfs server issues	All servers are back up. Most if not all issues have been fixed.
psnfs02	home directories on home5	Sep 3, 16:40		Sep 3, 19:11	psnfs02 crashed
ANA FS; GPUs, Export, Batch nodes	ANA02, ANA03, ANA11,ANA15, psana, psanafarm, psexport	July 7th, 5:00PM	July 9th, 8:00AM	July 8th, 6:00PM	Circuit breaker replacement	Circuit breaker has been replaced and sensitivity decreased. In order to replace the unit, the entire distribution panel (UDB-C) must be powered off.
ANA	ana13	June 8th	June 10th	June 10th, 10:45	psanaoss232 is down
Gateway, Build nodes, ML node	pscag1 - pscag4, psbuild-rhel5, psbuild-rhel6, psbuild-rhel7, psjerry	June 8th 7:30AM	June 8th 9:00AM	June 8th, 6:00PM	Power maintenance	Power maintenance completed. The gateway and build servers are now accessible.
VMware	pswww, pslogin, psdev	May, 18th, 08:00	May 18th	May 19th, 02:00		pslogin, psdev and pswww are not accessible. Access using the LCLS NX servers will continue to work.
VMware	pswww, pslogin, psdev	May, 16th, 10:00	May 16th, 17:00	May 17th, 18:30	VM hypervisor upgrade	Access using the LCLS NX servers will continue to work
psnfs02	home directories	May 12th	May 13th	May 13th, 10:14	server is down	Some users home directories are not accessible
psexport	Globus, gridftp	May 11th	May 12	May 13th, 19:30	psexport01 is down	psexport works again, gridftp (globus) is available.
pslogin, pswww	login, web services	May 12th	May 12th	May 12th, 11:20	virtual machine cluster	Most VMs should be accessible now.
pswww	Data Management Portal	Feb 29	Mar 3 5PM	Mar 5th	Migrating databases to LCLS2 infrastructure	We'll be migrating all the databases to LCLS2 data management systems. The current data management portal should be available as read only.
ANA	Analysis Infrastructure	4 December 19 4:00PM PDT	Friday, December 13th	superseded by other outages	Troubleshooting short-circuit	While bringing up some of the file systems, we encountered electrical issues. The plan is to keep things stable until Monday to make sure the equipment which is currently powered on is working properly. Status: Up: psana, psanaq, psexport, ana02, ana03, ana04, ana12, ana15 Down until further notice: ana14 Update: 2020-02-25 ana11 is up and batch nodes psana12xx
ANA file systems	Analysis infrastructure	Nov 14th, 2019	Nov 27th	Dec 4th	Move from building 50 to building 54 (SRCF)	Update: ANA14, ANA12, ANA04, and ANA02 up by Friday; ANA11 up by Monday. Completed tasks: ana03, ana12, ana13, and ana15 are up. Most of the psana batch queue is now available. The psana interactive pool can now be accessed, but several nodes are still down (e.g. psanagpu115, psanagpu116, etc.). The psexport nodes are operational.
pslogin, kerberos	pslogin, passwordless access	November 8, 2019 11:45AM	TBD	November 11th	2 Hypervisors are unresponsive
	ANA file systems	April 11, 12:30pm	Unknown	April 11, 7:30PM	SLAC wide power glitch	7:30PM: ANA14 is now online. All systems are up and functional. 6:00PM: All ANA lustre filesystems have been recovered except ANA14 which is suffering major hardware issues. We don't know yet how long it will take to recover it. 12:30PM: All systems are up by now except the ANA file systems because the Lustre MDS was damaged. We don't know yet how long it will take to recover it.
All	All nodes	January 4th	January 11th	January 25th (User Services)	Server room relocation	Systems that are up: psnxserv psana (interactive) psexport psdev pslogin pswww(elog) pshub(JupyterHub) psana(batch)
psana, psexport		Aug 13, 7pm	Aug 14	Aug 14	Configuration error	Wrong MTU setting in building 50 causes psana and psexport to not be able to mount NFS. Will fix this morning.
All	All nodes	July 26, 5:00PM	July 30, 5:00PM	August 2nd	Power Outage	Systems are now on-line fully functional. We encountered several systems with bad hardware and corrupted files. Unnecessary systems are now off-line.
psnfs03 and psnfs04	/reg/g and /reg/common	Apr 24, 6:30am	Apr 24, 9am	Apr 24, 8:30	firmware update
psnfsopr	Operators home	Apr 24, 6:30am	Apr 24, 9am	Apr 24, 7:30	Move and firmware update
All	Networking will be down, so all machines and services will be unavailable.	Jan 17, 2:45 PM	Jan 17, 5:00 PM	Jan 17, 4:45 PM	Central Router firmware upgrades and replace NFS SAS module.	The new SAS NFS module worked, but psnfs03 and psnfs04 took longer than expected to boot.
The pslogin, psdev and psnxserv nodes will be unavailable.	User home directories and some /reg/* NFS shares will be unavailable	Dec 26, 6:00 AM	Dec 26, 8:00 PM	Dec 26, 9:30pm	Firmware upgrades	Problems found with one of the NFS servers (psnfs03), may need to take another outage to fix
All ANA filesystems, interactive (GPU & Phi) nodes, psexport nodes and LSF batch nodes	Science data and associated servers	Dec 26, 6:00 AM	Dec 31, 5:00 PM	Dec 31, 8:00 AM	Electrical Work
HPSS Storage System will be unavailable	Data backup and recovery	Dec 25, 6:00 AM	Jan 8, 5:00 PM	Jan 1	Electrical Work
All ana filesystems, interactive nodes, psexport nodes and most batch nodes	Science data and associated servers	Aug 22, 1PM	Aug 23, 8:00 PM	Aug 23, 10:00PM	Cooling and Electrical Work	Outage recovery was late because we had several hardware problems which required intervention.
~~LCLS Computing~~	~~All LCLS computing services~~	~~July 25, 4:00 AM (PDT)~~	~~July 25, 6:00 PM (PDT)~~		Electrical Work	Outage Canceled.
All ana filesystems, interactive nodes, psexport nodes and most batch nodes	Science data and associated servers	June 7, 5:30 AM (PDT)	June 7, 6:00 PM (PDT)	June 7, 6:53 PM (PDT)	Electrical Work	Outage recovery was an hour late because server room electrical work extended beyond anticipated outage time and a Lustre-system RAID card failed.
ana02	Science data	Wed, April 19, 11 AM	Wed, April 19, 4 PM	Wed, April 19, 4 PM	Update OS and Lustre version
psnxserv03	psnxopr				No Machine upgrade	Please use psnxserv01 and 02 while 03 is upgraded
All	NFS	Fri Feb 24, 2017 9am	Fri Feb 24, 2017 9pm	Fri Feb 24, 2017 8:08pm	NFS upgrade	During this outage it won't be possible to access any user or operator home directories.
ana12	Science data	Aug 16, 9am	Aug 16, 12pm	Aug 16, 12pm	Hardware failure	Access to six OSTs is very slow, presumably because of a failing RAID card. We will shutdown one of five ana12 OSSs to replace the card and reconfigure as needed. No data is expected to be lost, but some data will be unavailable during the outage.
ana04	Science data	May 5th, 10am	May 10th, 10am	May 30th	Hardware failure	One of the ana04 OSTs doesn't detect a drive. There is enough redundancy to rebuild the array, but it's extremely slow and sometimes it hangs. The OST has been set read only. Moving the data to other OSTs. End: file system couldn't be recovered, but we were able to move all the data somewhere else, wipe and rebuild the file system.
Batch nodes	Batch jobs	Oct 5, 2015 11am	Oct 5, 2015 1pm		Move to RHEL7	Interactive nodes will also be moved to rhel7. Users logged into rhel5 interactive nodes will be able to continue their session, but batch submission will fail once batch nodes are converted to rhel7.
HPSS	Restoring files from tape	Sep 22, 2015 7am	Sep 24, 2015 5pm	Sep 24, 2015 12pm	Upgrade of HPSS to version 7.4	HPSS will become read-only on Sep 21st at 5pm.
psnehprioq/psfehprioq	All nodes	April 1, 2015 9am	April 1, 2015 8pm	March 31, 2015	Maintenance on nodes to allow addition of 640 cores to computing system	Ended early due to technical issues.
psnehq/psfehq	All nodes	March 31, 2015 11am	April 1, 2015 8pm	March 31, 2015	Maintenance on nodes to allow addition of 640 cores to computing system	Ended early due to technical issues.
All NEH/FEH computing	All LCLS computing services	Dec 31, 00:00hrs	Dec 31, 18:00hrs		Switching of generator power to building power at Building 950 where servers are housed.
All NEH/FEH computing		Nov. 7, 2014 ~12pm	Unknown	7:30pm	Unscheduled power outage	Power has been restored. We are running file system checks for our NFS servers before we can bring everything else back up. We are hoping to get things back up by 4.30pm.
LCLS Offline Filesystem	All LCLS Offline filesystems	Sep 3, 2014 ~7.30pm	N/A	Sep 4, 2014 12:45pm	Network outage causing offline filesystems to be inaccessible.	We have identified the root cause and the issue should be fixed now.
LCLS Computing	All LCLS computing services	Friday Aug 8th, Midnight (00:00 hrs)	Monday Aug 18th, 2014 Noon	Aug 18th, 2014 12pm	To prepare for the power shut down on Aug 11. To perform hardware and software maintenance before the power outage.	Not all machines will be shut down at once as we start performing system updates. Please do not depend on any service past midnight Aug 10th.
psexport, psana104, psana105	Export nodes, HPSS, scratch and calib backups	Thursday August 7th, Noon	Monday Aug 18th, 2014 COB		These machines need to be moved to Bld 50.
LCLS Online Computing	All Online Computing Nodes, DSS, FFB	Tuesday Aug 5th, 2014 5pm	Wednesday Aug 20th, 2014 COB		CXI DSS nodes and ffb nodes need to be recabled and reconfigured. The IB switch is moved to B50 as part of the offline storage move.
LCLS Offline Filesystem	All LCLS Offline filesystems ana01, ana02, ana03, ana04, ana11, ana12, ana14	Tuesday Aug 5th, 2014 5pm	Monday Aug 18th, 2014 COB		Offline Analysis Hardware will be moved from B950 and B999 to B50.	The equipment needs to be disconnected, and carefully moved before the power outage on Aug 11, and reconnected at B50.
LSF	LSF Job Submission and Management	Wednesday, Mar 19, 2014	N/A	Wednesday, 6:45pm	Unplanned software outage	No job can be submitted and managed at this time. We were informed that software vendor has been contacted, and SLAC Computing Division is working on the issue. Post-mortem from SLAC comp-out: The problem resulted from a bug in one of the new LSF 9.1.2 daemons. IBM is researching a fix and in the meantime we are still running 9.1.2, but that one daemon has been reverted to 9.1.1. We believe that jobs submitted prior to the problem should have continued to run and should continue to be tracked by LSF.
All computing services at LCLS (NEH, FEH, XRT, FEE, Undulator Hall)	All services at LCLS	Friday, Aug 9th, 2013 13:00hrs	Tuesday, Aug 13th, 2013 13:00hrs		Planned power outage at LCLS buildings
psana11, psana12, psana13, psana14	LSF Compute Nodes	Thursday, May 30th, 2013 16:30hrs	Friday, May 31st, 2013 12:00hrs		Unplanned power outage at SLAC	Most of the nodes have been brought up. A handful have memory-related problems and have been disabled in LSF awaiting diagnostics.
psanafeh		Thursday, May 30th, 2013 16:30hrs	Friday, May 31st, 2013 12:00hrs	Friday, May 31st, 2013 10:37am	Unplanned power outage at SLAC
psexport01		Thursday, May 30th, 2013 16:30hrs	Friday May 31st, 2013 12:00hrs	Friday, May 31st, 2013 11.00am	Unplanned power outage at SLAC
ana01, ana02	/reg/d/ana01, /reg/d/ana02 filesystems	Thursday, May 30th, 2013 16:30hrs	Friday May 31st, 2013 14:00hrs	Friday, May 31st, 6pm.	Unplanned power outage at SLAC
pssrv100 (psnfs)	NFS mountpoint for PCDS diskless nodes	Tuesday, Mar 26th, 2013 12:30pm	Tuesday, Mar 26th, 2013 5pm	Tuesday, Mar 26th, 2013 4:45pm
pssrv100 (psnfs)	NFS mountpoint for PCDS diskless nodes	Monday, Jan 7th, 2013 (1030 hrs)	Monday, Jan 7th, 2013 (1600 hrs)	Wednesday, Jan 9th, 2013 (1140 hrs)	RAID controller malfunctioned upon power restoral after planned power outage in B950 203A	pssrv101 (old data) was used to bring up the FEE nodes for part of the outage. pssrv100 was restored to operation after a new RAID controller was delivered and installed.
ana01	/reg/d/ana01 filesystem	Tuesday, Dec 18th 2012	unknown	Partial (98%) restoral Monday Dec 24th (0800 hrs)	Controller failed causing corrupted parity data	Parity errors fixed and new controller installed. 2 OSTs (LUNs) needed fsck'ing. One took a few hours, the other took 10 days.
psanaoss21*	/reg/d/ana12 filesystem	Monday, Oct 8th, 2012 (1700 hrs)	Monday, Oct 8th, 2012 (1900 hrs)	Monday, Oct 8th, 2012 (1900 hrs)	Hardware upgrades
psanaoss2**	/reg/d/ana11 and /reg/d/ana12 filesystem	Thursday, Sep 27, 2012 (1700 hrs)	Friday, Sep 28, 2012 (0100 hrs)	Friday, Sep 28, 2012 (0400 hrs)	Hardware upgrades
Sitewide outage. All Linux Servers at NEH, FEH, XRT, FEE.	All computing services at LCLS.	Wednesday August 15, 2012 1:15 PM (1300 hrs)	August 17, 2012 1:00 PM (except psanafeh, ana11 and ana12 file systems, which will be down until Aug 21, 2012)		SLAC sitewide power outage on August 16. Electrical work at LCLS. Server maintenance.	Expect logging in to any machines to be unavailable between 8/15 and 8/17 even if some of the servers are powered up before the expected end time. There will be maintenance performed on various servers during these 2 days.
All machines in XPP hutch and control room will be inaccessible.	XPP	Monday April 9, 2012 11:15AM	Monday April 9, 2012 11:45AM	Monday April 9, 2012 11:30AM	Electrical Work at XPP Hutch	Completed
	ana01/ana02 file systems	Wed Mar 28th, 2012 9am	Wed Mar 28th, 2012 1pm	Wed Mar 28th, 2012 4pm	Upgrade to IB	Completed
psananeh lclsq ana01 ana02	NEH storage and processing	Wed Dec 21, 2011 6am	Tue Dec 27, 2011 4pm	Mon Dec 26, 2011 1pm	Chilled water outage	Completed. Chilled water was restored on Friday.
psana batch nodes	All Science data is currently unavailable. Psananeh psanafeh is up for Matlab use, but no access to data on Lustre file system.	Saturday Oct 1, 2011 6am			Lustre file system remains down after the unplanned power outage on Saturday.	The system administrators are working to bring them back.
	All LCLS computing services	Monday Nov 14, 2011 7am			Electrical work at NEH server room and FEH.	pslogin is up. NFS server, LDAP, DNS, pswww are up. The daq nodes will not come up until after 4PM. Lustre will not come up until after about 4.30PM. Batch nodes (psana11* psana12) and psana01 will not be up until Lustre is up.
psana, NEH Online Nodes, psimport, psexport, pslogin, psdev, psanasrv100, psanasrv101, psanasrv102	All Science data, All user home directories, all DAQ cache nodes. All online services.	Wed Sep 28, 2011 10am	Wed Sep 28, 2011 6pm	Wed Sep 28, 2011 6pm	Upgrade of Lustre hardware. Installation of taylor on several offline systems. Update of kernel on Online nodes.
psana	Science data access	Tue Sep 20, 2011 11:15am		Tue Sep 20, 2011 6:15pm	NEH power outage	B950 and several other buildings experienced a short power glitch, but the Lustre file servers did not survive the interruption and are still being brought up.
psana	Science data access	Thu Jun 2, 2011 1pm	Thu Jun 2, 2011 5pm		Lustre failover testing.
NEH online nodes ana02 psexport, psimport	NEH DAQ, outside ssh access	Thu May 25, 2011 noon	Thu May 25, 2011 7pm		Server room upgrade, ana02 memory upgrade	Completed
psana	Science data access	Thu May 12, 2011 1pm	Thu May 12, 2011 6pm	Thu May 12, 2011 6.30pm	Lustre maintenance	Completed. Upgraded memory on psanaoss101-104, and replaced 10Gbit cards with 1 port SMCs. 717W power supplies are in place on psanaoss103-104 now.
psana	Science data access	Thu May 5, 2011 1pm	Thu May 5, 2011 5pm	Thu May 5, 2011 5pm	Lustre maintenance	Completed
All	All	Fri Apr 29, 2011 6.30pm	Sun May 1, 2011 11pm	Sun May 1, 2011 9pm	NEH power outage	Completed
psana	Science data access	Thu Apr 28, 2011 2pm	Thu Apr 28, 2011 6pm	Thu Apr 28, 2011 3pm	Lustre maintenance pssrv100 NFS volume reconstruction.	Completed Lustre maintenance postponed. Raid reconstruction pssrv100 will take 2-3 days. The new volume size is not released by the controller, so we will have to perform the file system resize on another day.
psana	Science data access	Fri Apr 1, 2011 6pm	Mon Apr 4, 2011 10am		NEH cooling outage	Completed
psana	Science data access	Thu Mar 31, 11am	Thu Mar 31, 5pm		Enabling HA for Lustre system	Completed
All	All	Sat Mar 26, 2011 7am	Sat Mar 26, 2011 7pm	Mon Mar 28, 2011 1pm	NEH power cut	Completed
psana	Science data access	Thu Mar 24, 2011 11am	Thu Mar 24, 2011 5pm		Lustre testing	Completed
All	All	Wed Mar 23, 2011 10am	Wed Mar 23, 2011 3pm		NEH power cut	This power cut was NOT planned
All	All	Sat Mar 19, 2011 7am	Sat Mar 19, 2011 7pm	Mon Mar 21, 2011 10am	NEH power cut	Completed
