This page works out which services and servers we need to keep running to get through the December 2017 power outage. None of the pages mapping services to servers are up to date.

Version 1.2 (5 Oct 2017)

Power outage timeline (from Shirley). Note that the "10am" and "5pm" times are notional current guesses and could change.

| Date | Time | Equipment | Action |
|---|---|---|---|
| Fri 22 Dec 2017 | EOB? | non-critical, general use | power off |
| Tue 26 Dec 2017 | before 10am | non-critical, special request | power off |
| | | exp-critical | power off |
| | 10am | Bldg 50 | power off |
| | after 10am | exp-critical | power ON |
| | | High-avail | continuously powered ON |
| Sat 30 Dec 2017 | before 5pm | exp-critical | power off |
| | 5pm | Bldg 50 | power ON |
| | after 5pm | exp-critical | power ON |
| | | non-critical, special request | power ON |
| Mon 8 Jan 2018 | starting 8am | non-critical, general use | power ON |
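As a quick sanity check on the timeline, here is a minimal sketch that computes how long Bldg 50 is expected to be without power. It assumes the notional "10am" and "5pm" times hold; the constant and function names are illustrative only.

```python
from datetime import datetime

# Notional times taken from the timeline; both are current guesses and may change.
BLDG50_POWER_OFF = datetime(2017, 12, 26, 10, 0)  # Tue 26 Dec 2017, 10am
BLDG50_POWER_ON = datetime(2017, 12, 30, 17, 0)   # Sat 30 Dec 2017, 5pm


def outage_hours(off: datetime, on: datetime) -> float:
    """Return the length of the window between power off and power on, in hours."""
    return (on - off).total_seconds() / 3600.0


hours = outage_hours(BLDG50_POWER_OFF, BLDG50_POWER_ON)
print(f"Bldg 50 without power for about {hours:.0f} hours")  # 4 days 7 hours = 103 hours
```

If the notional times move, only the two constants need updating.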

 

The servers in the table below must remain powered up and operational for Fermi Level 1 (L1) to succeed. Items still to confirm:

  • Confirm current H.A. rack occupants.  See spreadsheet here (thanks Shirley!) https://portal.slac.stanford.edu/info/ITHelp/KB%20Assets/HA-Servers.xlsx
  • Confirm the VM-master for a given VM.  Use the 'node' command, e.g., $ node -whereis fermilnx-v12
  • Confirm the tomcat <-> service associations.  Table here.
  • Confirm the tomcat-VM associations in this table. Use the 'node' command, e.g., $ node -whereis glast-tomcat01
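The VM-master checks above could be scripted along these lines. This is only a sketch: the output format of `node -whereis` is not documented on this page, so the code assumes (hypothetically) that the name of the VM-master appears somewhere in the command's text output. The mapping lists a few entries from the server table for illustration, and the function names are made up for this example.

```python
import subprocess
from typing import Callable, Dict, List

# VM -> expected VM-master, copied from the server table on this page (partial list).
EXPECTED_MASTER: Dict[str, str] = {
    "fermilnx-v02": "fermilnx05-vmm",
    "fermilnx-v03": "fermilnx05-vmm",
    "fermilnx-v12": "fermilnx07-vmm",
}


def run_whereis(vm: str) -> str:
    """Run `node -whereis <vm>` and return its raw text output."""
    result = subprocess.run(
        ["node", "-whereis", vm], capture_output=True, text=True, check=False
    )
    return result.stdout


def find_mismatches(
    expected: Dict[str, str], lookup: Callable[[str], str] = run_whereis
) -> List[str]:
    """Return the VMs whose expected master is not mentioned in the lookup output.

    ASSUMPTION: the master's hostname appears verbatim in the output text.
    """
    return [vm for vm, master in expected.items() if master not in lookup(vm)]
```

Calling `find_mismatches(EXPECTED_MASTER)` on a host where the 'node' command is available would list any VM that needs a second look; an empty list means the table agrees with the command output under the assumption above.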

| server | VM/service | function | HA rack |
|---|---|---|---|
| fermi-gpfs02 | | xrootd server | |
| fermilnx05-vmm | fermilnx-v02 | xrootd redirector | |
| fermilnx07-vmm | fermilnx-v12 | xrootd redirector | |
| wain031 (or equivalent) | | NFS storage | yes |
| fermilnx01 | | LAT config, fastcopy and real-time telemetry | yes |
| fermilnx02 | | LAT config, fastcopy and real-time telemetry | yes |
| fermilnx05-vmm | fermilnx-v03 | archiver | |
| fermi-oracle01 | | oracle primary | yes |
| fermi-oracle02 | | oracle secondary | yes |
| mysql05/06 | mysql-node03 | calibration, etc. DB | yes |
| hequNNN - hequNNN+24 | | batch hosts | |
| fermilnx03-vmm | fermilnx-v07/tomcat01 | Commons, Group manager | yes |
| fermilnx09-vmm | fermilnx-v16/tomcat06 | rm2 | |
| fermilnx07-vmm | fermilnx-v05/tomcat08 | dataCatalog | |
| fermilnx09-vmm | fermilnx-v17/tomcat09 | Pipeline-II | |
| fermilnx09-vmm | fermilnx-v18/tomcat10 | FCWebView, ISOCLogging, MPWebView, TelemetryMonitor, TelemetryTableWebUI | |
| fermilnx07-vmm | fermilnx-v10/tomcat11 | DataProcessing | |
| fermilnx07-vmm | fermilnx-v11/tomcat12 | TelemetryTrending | |
| (non-Fermi server) | astore | data archive | |
| (non-Fermi server) | trscron | tokenized cron | |
| (non-Fermi server) | lnxcron | cron | |
| (non-Fermi server) | (farm manager, etc.) | LSF | |
| yfs01 (non-Fermi) | | AFS | yes |
| yfs02 (non-Fermi) | | AFS | yes |
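To see at a glance which VM-masters carry which VMs, and which required servers are not yet marked as being in an HA rack, the table can be captured as data. The sketch below is illustrative only: a blank "HA rack" cell is treated as "not confirmed" (an assumption, since blank could also mean "not checked yet"), and the function names are made up for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (server, VM/service, function, marked "yes" in HA rack column), from the table above.
ROWS: List[Tuple[str, str, str, bool]] = [
    ("fermi-gpfs02", "", "xrootd server", False),
    ("fermilnx05-vmm", "fermilnx-v02", "xrootd redirector", False),
    ("fermilnx07-vmm", "fermilnx-v12", "xrootd redirector", False),
    ("wain031 (or equivalent)", "", "NFS storage", True),
    ("fermilnx01", "", "LAT config, fastcopy and real-time telemetry", True),
    ("fermilnx02", "", "LAT config, fastcopy and real-time telemetry", True),
    ("fermilnx05-vmm", "fermilnx-v03", "archiver", False),
    ("fermi-oracle01", "", "oracle primary", True),
    ("fermi-oracle02", "", "oracle secondary", True),
    ("mysql05/06", "mysql-node03", "calibration, etc. DB", True),
    ("hequNNN - hequNNN+24", "", "batch hosts", False),
    ("fermilnx03-vmm", "fermilnx-v07/tomcat01", "Commons, Group manager", True),
    ("fermilnx09-vmm", "fermilnx-v16/tomcat06", "rm2", False),
    ("fermilnx07-vmm", "fermilnx-v05/tomcat08", "dataCatalog", False),
    ("fermilnx09-vmm", "fermilnx-v17/tomcat09", "Pipeline-II", False),
    ("fermilnx09-vmm", "fermilnx-v18/tomcat10", "FCWebView, ISOCLogging, MPWebView, "
     "TelemetryMonitor, TelemetryTableWebUI", False),
    ("fermilnx07-vmm", "fermilnx-v10/tomcat11", "DataProcessing", False),
    ("fermilnx07-vmm", "fermilnx-v11/tomcat12", "TelemetryTrending", False),
    ("(non-Fermi server)", "astore", "data archive", False),
    ("(non-Fermi server)", "trscron", "tokenized cron", False),
    ("(non-Fermi server)", "lnxcron", "cron", False),
    ("(non-Fermi server)", "(farm manager, etc.)", "LSF", False),
    ("yfs01 (non-Fermi)", "", "AFS", True),
    ("yfs02 (non-Fermi)", "", "AFS", True),
]


def vms_by_master(rows: List[Tuple[str, str, str, bool]]) -> Dict[str, List[str]]:
    """Group VM/service names under their VM-master (servers ending in '-vmm')."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for server, vm, _function, _ha in rows:
        if server.endswith("-vmm"):
            grouped[server].append(vm)
    return dict(grouped)


def ha_unconfirmed(rows: List[Tuple[str, str, str, bool]]) -> List[str]:
    """Return the sorted, de-duplicated servers not marked 'yes' in the HA rack column."""
    return sorted({server for server, _vm, _function, ha in rows if not ha})


for master, vms in vms_by_master(ROWS).items():
    print(master, "->", ", ".join(vms))
print("HA rack not confirmed:", ", ".join(ha_unconfirmed(ROWS)))
```

Grouping by VM-master makes it easier to see that losing a single hypervisor such as fermilnx07-vmm would take several services down at once.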

 

High-availability racks

For general information about the High-availability racks, Shirley provided this pointer to the latest list:

"Service Now, Knowledge Base,  search for "High Availability" , following link for current servers"

And here is the current statement about high-availability functionality:

Current Services in HA Racks
  • CATER application
  • Confluence application
  • Data center management tool
  • Drupal web
  • Email lists
  • Email transport infrastructure
  • ERP application
  • Exchange email
  • EXO application
  • Facilities monitoring
  • Fermi application
  • IT Ticketing system
  • Network infrastructure
  • Site Security infrastructure
  • Unix authentication infrastructure
  • Unix AFS infrastructure
  • Unix mailboxes
  • Unix monitoring
  • VPN
  • Windows authentication infrastructure
  • Windows file servers and SAN
  • Windows monitoring
  • Windows web

 

 

 

The services for L1:

oracle

  • pipeline
  • data catalog
  • group manager

mysql

  • calibrations

tomcats

  • pipeline
  • data catalog
  • data processing

isoc servers

xroot

  • fermi-gpfs02 (xrootd server)
  • fermilnx-v02 (redirector)
  • fermilnx-v12 (redirector)


nfs

  • Pretty much everything that's currently on wain031

LSF

  • ~25 hosts should let us keep up

 

Here's what ISOC tasks need:

FASTCopy chain
--------------
wain031
fermilnx01
fermilnx02
trscron
fermilnx-v03 (Archiver)
Whatever the pipeline server runs on.
xroot servers
astore system


Web servers
-----------
tomcat01 Commons
tomcat06 rm2
tomcat09 Pipeline-II
tomcat10 FCWebView, ISOCLogging, MPWebView,
         TelemetryMonitor, TelemetryTableWebUI
tomcat11 DataProcessing
tomcat12 TelemetryTrending
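A small consistency check between the ISOC web server list and the server table on this page can catch omissions in either direction. The sketch below is illustrative; the tomcat names are copied from this page, while the set and function names are made up for this example.

```python
from typing import List, Set

# Tomcats that appear in the server table on this page.
TABLE_TOMCATS: Set[str] = {
    "tomcat01", "tomcat06", "tomcat08", "tomcat09",
    "tomcat10", "tomcat11", "tomcat12",
}

# Tomcats that the ISOC tasks list as needed.
ISOC_TOMCATS: Set[str] = {
    "tomcat01", "tomcat06", "tomcat09", "tomcat10", "tomcat11", "tomcat12",
}


def missing_from_table(needed: Set[str], table: Set[str]) -> List[str]:
    """Return tomcats that are needed but absent from the server table."""
    return sorted(needed - table)


def only_in_table(needed: Set[str], table: Set[str]) -> List[str]:
    """Return tomcats that are in the server table but not in the needed list."""
    return sorted(table - needed)


print("Needed by ISOC but not in table:", missing_from_table(ISOC_TOMCATS, TABLE_TOMCATS))
print("In table but not on ISOC list:", only_in_table(ISOC_TOMCATS, TABLE_TOMCATS))
```

With the lists as they stand, every ISOC tomcat is in the server table; tomcat08 (dataCatalog) is in the table only because the L1 services list needs the data catalog.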

 


Notes:

9/29/2017 - (TG and WK) added list of xrootd servers needed to bridge the gap in December; started table of server names
