A place to track projections for hardware going into Building 50 - for power, space and cooling planning purposes.

Hardware Proposal: PPA has requested $600k of annual hardware funding from DOE to cover small-experiment and peak computing needs. This is primarily for cycles; storage needs seem modest.

FY13-14 Projection/update

Raw inputs here

Fermi

$600k Options include

  • need ~3000 cores for 6 months for reprocessing, to be followed by a similar load of MC sims.
  • replacing 2 Oracle servers
  • potentially another PB disk
  • unknown amount of tape (5 TB?)

ATLAS

  • current allocation about 3400 cores. Agreed with ATLAS to increase by about 1000 cores. Always in use.

EXO

  • 200 core DC + bursts to 1000 needed

CDMS

  • much like EXO

KIPAC

  • "random" user needs approximated to 1600 cores

HPS

  • minimal at SLAC - Jefferson has agreed to provide the needed ~1M CPU-hrs
  • storage ~100 TB

BABAR

  • 2600 core allocation - no plan to add more in general queues. LTDA good for now?

HEP Theory

  • peak needs about 250k CPU-hrs in 1-2 week episodes: 1000-2000 cores
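
As a rough cross-check of the Theory numbers, CPU-hours can be converted to a sustained core count. This is a minimal sketch that assumes every core is busy for the whole window; the function name `sustained_cores` is hypothetical and not part of any SLAC tooling.

```python
def sustained_cores(cpu_hours, days):
    """Cores needed to deliver cpu_hours in the given number of days,
    assuming 100% utilization (an idealization)."""
    return cpu_hours / (days * 24)


# 250k CPU-hrs delivered in a 1-week or a 2-week episode
for days in (7, 14):
    print(f"{days} days: ~{sustained_cores(250_000, days):.0f} cores")
```

The ideal-utilization figures come out somewhat below the 1000-2000 cores quoted in the bullet, so the quoted range is the safer planning number.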

General Users

  • Allocation of about 2350 cores

Core numerology

Note: 1500 cores are hanging by a thread in the to-be-retired Black Boxes.

Rule of thumb used: 10 allocation units (shares) are ~1 core. Applied to the shares from a recent bqueues -l, this yields about 12.5k cores total.

USER/GROUP   SHARES  PRIORITY  STARTED  RESERVED  CPU_TIME  RUN_TIME
exoprodgrp   1000     333.333      0        0         0.0        0
lcdprodgrp    836     278.667      0        0         0.0        0
cdmsdata     1000     166.667      1        0         0.0   654048
rpgrp         418     139.333      0        0         0.0        0
glastdata   25174      40.806     29        0   2709927.5   297435
glastgrp     8107      20.155     67        0   1020309.2    24120
hpsprodgrp    600      16.304      4        0    112119.7    72947
lcd           418       7.420      7        0    166287.5   133324
rdgrp         342       1.275      0        0   1364227.6        0
babarAll    26332       1.132    989        0 104347464.0 16872557
glastusers   2000       1.087    361        0   3881826.5 16058350
AllUsers    23545       0.780   2092        0 123027648.0 18098542
theorygrp    1000       0.628      5        0   8096284.0   140116
atlasgrp    34307       0.606   3172        1 242260000.0 140770233
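
The rule of thumb can be checked against the SHARES column above. This is a minimal sketch: the share values are copied from the table, while the names `SHARES`, `SHARES_PER_CORE` and `estimated_cores` are hypothetical helpers introduced here for illustration.

```python
# SHARES column copied from the bqueues -l output above.
SHARES = {
    "exoprodgrp": 1000,
    "lcdprodgrp": 836,
    "cdmsdata": 1000,
    "rpgrp": 418,
    "glastdata": 25174,
    "glastgrp": 8107,
    "hpsprodgrp": 600,
    "lcd": 418,
    "rdgrp": 342,
    "babarAll": 26332,
    "glastusers": 2000,
    "AllUsers": 23545,
    "theorygrp": 1000,
    "atlasgrp": 34307,
}

# Rule of thumb from the text: ~10 allocation units per core.
SHARES_PER_CORE = 10


def estimated_cores(shares, shares_per_core=SHARES_PER_CORE):
    """Convert fairshare allocation units to an approximate core count."""
    return shares / shares_per_core


total_shares = sum(SHARES.values())
total_cores = estimated_cores(total_shares)
print(f"{total_shares} shares -> ~{total_cores:.0f} cores")

# Per-group estimates, largest first
for group, shares in sorted(SHARES.items(), key=lambda kv: -kv[1]):
    print(f"{group:<12} ~{estimated_cores(shares):.0f} cores")
```

The per-group results line up with the allocations quoted earlier on this page (ATLAS about 3400, BABAR about 2600, general users about 2350), which suggests the 10:1 ratio is a reasonable approximation.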

FY13 Projection

Fermi

$600k Options include

  • replacing 2 Oracle servers
  • 35 nodes to go into the general queues
  • potentially another 400 TB disk
  • unknown amount of tape (5 TB?)
BABAR
  • ballpark $100k in various areas - replacing old file servers, tape etc. Optimization not done yet.
ATLAS
LSST
EXO
  • $32k for disk
  • $20k for standalone linux servers
  • $5k tapes
CDMS
  • $16k for disk, but would prefer buying in to a shared storage solution
HEP Theory
LCD

Mid FY12 Update

Fermi

$500k left to spend in CY12. Options include

  • replacing 25 glastlnx standalone servers (with perhaps 8 beefier ones)
  • contribute $35k for two 5 TB tape drives
  • 35 nodes to go into the general queues
  • potentially another 400 TB disk
  • unknown amount of tape (1 vs 5 TB?)
BABAR
  • $35k for two 5 TB tape drives
  • 20 nodes added to LTDA
  • line card for the LTDA switch, and new EtherLite for the serial consoles
ATLAS
  • 2 cabinets of mostly storage
LSST
  • 4-5 compute nodes
EXO

Hoping for the following. Not sure how much of the disk money is really left due to other EXO expenditures.

  • $32k for disk
  • $20k for standalone linux servers
  • $5k tapes
KIPAC
  • replace interactive login servers ki-ls01-06
CDMS
  • nada
HEP Theory
  • 24 TB disk hoped for
LCD
  • 1 Dell R610 for grid access to their fileserver
Totals (roughly):

ITEM                 QUANTITY      $
Storage              1500 TB    500k
Standalone servers   18          90k
"Batch" nodes        60         300k
Tape drives          4           70k
Total                           960k
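
The dollar figures in the totals can be sanity-checked with a few lines of arithmetic. This is a sketch only: it assumes the 70k entry is the cost of the 4 tape drives (its column header is blank in the source), and the dictionary and variable names are hypothetical.

```python
# (quantity, cost in k$) per category, taken from the totals above.
# Assumption: the unlabeled 70k belongs to the 4 tape drives.
TOTALS = {
    "storage_tb": (1500, 500),
    "standalone_servers": (18, 90),
    "batch_nodes": (60, 300),
    "tape_drives": (4, 70),
}

total_k = sum(cost for _, cost in TOTALS.values())
print(f"Total: {total_k}k")

# Implied unit costs in k$ per TB, server, node, or drive
unit_cost_k = {name: cost / qty for name, (qty, cost) in TOTALS.items()}
for name, cost in unit_cost_k.items():
    print(f"{name:<20} {cost:.2f}k each")
```

The implied 17.5k per tape drive agrees with the $35k per pair of drives listed under Fermi and BABAR.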



FY2012
BABAR

Complete LTDA purchase - CD already knows the details. Retire old fileservers, using returned Thumpers from Fermi.

Fermi

expect ~400 TB fileservers and 10 standalone linux servers. Presumably to be used for retiring old servers. 

CDMS

perhaps 1-2 fileservers

ATLAS

The ATLAS Tier 2 projection is that we will spend about $220k on hardware (more if recharge costs are low, less if they are high). In addition, it is a reasonable guess that there will be up to $50k of other ATLAS-related purchases.

According to my rules of thumb, this will translate into at most 90 rack units and 27 kW dissipated.

We should also expect to retire our three oldest Thumpers, and offer to retire our 79 Sun X2200 M2s that are in one of the Black Boxes.

KIPAC

- our FY11 CEP for storage should complete soonish with about 1/2 rack (4x4U + servers); power density is modest (~250W/U)
- FY12 is unclear; if we get the requested amounts then we would likely get ~48-core/2U machines filling 2/3 rack at higher power density (700W/U)
DES:
- If DES uses BaBar leftovers, we need to keep those running and buy storage. Estimated upper limit is 1/2 rack at modest power density (~250W/U)

So in summary we need something like 2 racks, one at high power and one at medium/low. It is possible that something old gets retired, but nothing significant this year.

EXO

We have budgeted approximately $32k/year for file servers, $20k/year for miscellaneous servers (or whatever) and $5k/year for tape for the next few years. Given that we are now taking data at a fairly steady rate we will need to review our requirements and may need to make some adjustments.

Theory

Marvin Weinstein

I have an LDRD proposal that is being considered for the development of DQC.  If it is funded I plan to purchase 3 machines with a minimum of 48 processors and a good GPU and 64 GB of RAM and at least 2 TB of hard drive.  Since these machines will have 4 or more real CPUs, they will have to run Windows Server.  I assume Building 50 is your domain; I certainly plan to try and house these beasties with you.
