  • New servers: fermi-gpfs01 and fermi-gpfs02
    • Dual-connect storage between these two machines
    • Internet connectivity (2 x 10 Gbps per host?)
    • Install GPFS
    • Install xrootd
    • Balance data across xroot cluster
  • New NFS/GPFS service on former fermi-xrd01 and fermi-xrd02 (and wain069, wain071)
    • fermi-xrd01 and fermi-xrd02
      • Drain xroot data (~180 TB)
      • Swap 2x R610 with SCS-owned R720 machines
      • Rename (fermi-gpfs03 and fermi-gpfs04?)
      • Internet connectivity (1 x 10 Gbps per host?)
      • Decide upon storage configuration
        • Number of spindles for Users/Groups (from wain025)
        • Number of spindles for production partitions (from wain026 and wain032)
        • Number of "spare" spindles for future expansion
      • Install GPFS (total capacity will be ~160 TB)
      • Migrate wain025, wain026, wain032 to new system
    • Select two wains for CNFS service (lots of memory + fast ethernet)
      • Drain xrootd data from wain069 and wain071 (~60 TB; both hosts are unreliable because of their Seagate disks)
      • Swap the 4 x 1 Gb interfaces for the 10 Gb NICs from wain080 and wain081
      • Rename hosts (fermi-cnfs01, fermi-cnfs02?)
      • Install GPFS and CNFS software
      • Configure so that activity on the wain025 partitions does not degrade access to other partitions
  • Upgrade NFS service for LSST
    • Repurpose wainZZZ to replace wain006
  • Drain xroot from other wains that are to be retired
  • Retirement list:
    • wain026
    • wain032
    • wain006
    • wains that are unreliable (due to Seagate drives?)
      • wain053  (29TB to be moved)
      • wain054  (22TB)
      • wain055  (27TB)
      • wain056  (30TB, total = 108TB)
    • Retirement option: Given that the wain05x machines are the newest and most powerful Sun servers in the cluster, we might consider swapping physical disks from wain017/019/020/021 with wain053/054/055/056, then retiring the older machines. Wilko rightly points out that this option requires more labor. Is it worth the extra work?
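
The drain volumes quoted in the plan above can be tallied with a short script. This is only a rough sketch: the figures are the approximate TB values listed on this page, the dictionary and function names are made up for illustration, and the "other wains that are to be retired" are left out because no volume is given for them.

```python
# Approximate xrootd volumes to drain, in TB, as quoted in the plan.
# Names here (drain_tb, total_tb) are illustrative, not part of any tool.
drain_tb = {
    "fermi-xrd01/02": 180,  # "~180 TB", approximate
    "wain069/071": 60,      # "~60 TB", approximate
    "wain053": 29,
    "wain054": 22,
    "wain055": 27,
    "wain056": 30,
}


def total_tb(hosts):
    """Sum the quoted drain volume (TB) over the given host keys."""
    return sum(drain_tb[h] for h in hosts)


unreliable = ["wain053", "wain054", "wain055", "wain056"]
unreliable_total = total_tb(unreliable)  # the wain05x subtotal
grand_total = total_tb(drain_tb)         # everything with a quoted volume

print(f"wain053-056 subtotal: {unreliable_total} TB")
print(f"all quoted drains:    {grand_total} TB")
```

The wain05x subtotal matches the 108 TB noted in the retirement list; the grand total is only as good as the "~" estimates that go into it.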

Milestone Timeline

 

Date         Milestone
7/1/2014     new servers arrive, fermi-gpfs01 and fermi-gpfs02
7/30/2014    storage arrays arrive
9/18/2014    cables located, beginning of GPFS testing at SCS
1/13/2015    xrootd in production (read-only), and data migration/balancing begins
2/18/2015    agree upon general Fermi storage plan
??           fermi-xrd01/02 and wain069/071 drained
??           former fermi-xrd01/02 + wain069/071 configured for CNFS service
??           NFS data migrated to new service and ready for production
??           wain006 migrated to wainZZZ
??           all remaining retirees drained
