
All times are PDT. Red entries are active. Most recent entry first.

Nodes

Services

Start Time

Expected End Time

Actual End Time

Reason

Comments

psananeh
lclsq
ana01
ana02

NEH storage and processing

Wed
Dec 21,
2011
6am

Tue
Dec 27,
2011
4pm

Mon
Dec 26,
2011
1pm

Chilled water outage

Completed. Chilled water was restored on Friday.

psana batch nodes

All science data is currently unavailable. psananeh and psanafeh are up for Matlab use, but there is no access to data on the Lustre file system.

Saturday
Oct 1,
2011
6am

 

 

The Lustre file system remains down after the unplanned power outage on Saturday.

The system administrators are working to bring it back.

 

 

 

All LCLS computing services

Monday
Nov 14,
2011
7am

 

 

Electrical work in the NEH server room and the FEH.

pslogin is up. NFS server, LDAP, DNS, pswww are up.
The DAQ nodes will not come up until after 4pm.
Lustre will not come up until after about 4:30pm.
Batch nodes (psana11*, psana12*) and psana01* will not be up until Lustre is up.

 

 

psana,
NEH Online Nodes,
psimport,
psexport,
pslogin,
psdev,
psanasrv100,
psanasrv101,
psanasrv102

All science data, all user home directories, all DAQ cache nodes, all online services.

Wed
Sep 28,
2011
10am

Wed
Sep 28,
2011
6pm

Wed
Sep 28,
2011
6pm

Upgrade of Lustre hardware.
Installation of taylor on several offline systems. Update of kernel on Online nodes.

 

 

 

psana

Science data access

Tue
Sep 20,
2011
11:15am

 

Tue
Sep 20,
2011
6:15pm

NEH power outage

B950 and several other buildings experienced a short power glitch. The Lustre file servers did not survive the interruption and are still being brought up.

 

 

psana

Science data access

Thu
Jun 2,
2011
1pm

Thu
Jun 2,
2011
5pm

 

Lustre failover testing.

 

NEH online nodes
ana02
psexport, psimport

NEH DAQ, outside ssh access

Thu
May 25,
2011
noon

Thu
May 25,
2011
7pm

 

Server room upgrade, ana02 memory upgrade

Completed

psana

Science data access

Thu
May 12,
2011
1pm

Thu
May 12,
2011
6pm

Thu
May 12,
2011
6:30pm

Lustre maintenance

Completed. Upgraded memory on psanaoss101-104 and replaced the 10Gbit cards with 1-port SMCs. 717W power supplies are now in place on psanaoss103-104.

psana

Science data access

Thu
May 5,
2011
1pm

Thu
May 5,
2011
5pm

Thu
May 5,
2011
5pm

Lustre maintenance

Completed

All

All

Fri
Apr 29,
2011
6:30pm

Sun
May 1,
2011
11pm

Sun
May 1,
2011
9pm

NEH power outage

Completed

psana

Science data access

Thu
Apr 28,
2011
2pm

Thu
Apr 28,
2011
6pm

Thu
Apr 28,
2011
3pm

Lustre maintenance
pssrv100 NFS volume reconstruction.

Completed
Lustre maintenance postponed.
RAID reconstruction on pssrv100 will take 2-3 days. The controller has not released the new volume size, so we will have to perform the file system resize on another day.

psana

Science data access

Fri
Apr 1,
2011
6pm

Mon
Apr 4,
2011
10am

 

NEH cooling outage

Completed

psana

Science data access

Thu
Mar 31,
2011
11am

Thu
Mar 31,
2011
5pm

 

Enabling HA for Lustre system

Completed

All

All

Sat
Mar 26,
2011
7am

Sat
Mar 26,
2011
7pm

Mon
Mar 28,
2011
1pm

NEH power cut

Completed

psana

Science data access

Thu
Mar 24,
2011
11am

Thu
Mar 24,
2011
5pm

 

Lustre testing

Completed

All

All

Wed
Mar 23,
2011
10am

Wed
Mar 23,
2011
3pm

 

NEH power cut

This power cut was NOT planned

All

All

Sat
Mar 19,
2011
7am

Sat
Mar 19,
2011
7pm

Mon
Mar 21,
2011
10am

NEH power cut

Completed
