The Shared (General) Farm consists of several physical clusters that are available to all SLAC users. The cluster hardware was purchased incrementally over several years by various stakeholder groups. Each physical cluster is based on a specific hardware model, but all hosts in the farm run 64-bit RHEL/CentOS 6. The LSF "general" queues feed users' jobs to the shared farm.

The stakeholders have associated LSF user groups with a fairshare (scheduling priority) that reflects their cluster investment. This ensures that stakeholders always get some runtime on the cluster when utilization is high. Users who are not members of stakeholder groups can still run jobs "for free": a superset fairshare group, "AllUsers", includes all SLAC users. A non-stakeholder must compete for priority with all other users running jobs on the shared farm. This may be acceptable for some, but production environments may demand priority scheduling.

The free "AllUsers" fairshare is subsidized by the paying stakeholders. A stakeholder's fairshare value is derived from the compute power they have purchased. An HS06 CPU benchmark is calculated for each cluster server model.

...

The 15% tax results in 14523 shares for the AllUsers group.

LSF fairshare user groups

The stakeholders are responsible for distributing their shares among their respective LSF groups. The table below should match the production LSF config:

USER/GROUP    HS06_SHARES   %HS06_SHARES   HS06_OWNER
atlasgrp      31157         32.18          ATLAS
babarAll      7859          8.12           BaBar
AllUsers      14523         15             Everyone (15% tax)
glastdata     854           0.88           Fermi
glastusers    23181         23.94          Fermi
glastgrp      366           0.38           Fermi
geantgrp      3874          4              Geant
luxlz         3500          3.61           PPA others
cdmsdata      2000          2.07           PPA others
lcdprodgrp    1100          1.14           PPA others
exoprodgrp    1500          1.55           PPA others
hpsprodgrp    1000          1.03           PPA others
rpgrp         500           0.52           PPA others
lcd           600           0.62           PPA others
exousergrp    550           0.57           PPA others
rdgrp         0             0              PPA others
theorygrp     4257          4.4            Theory
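
The percentages in the table can be cross-checked with a short script. The following is a minimal sketch, not part of the production tooling: the share values are copied from the table, the function names are made up for illustration, and the tax formula is an assumption (it treats the 15% as a fraction of the whole pool, stakeholders plus AllUsers), chosen because it reproduces the 14523 figure.

```python
# HS06 shares copied from the table above.
SHARES = {
    "atlasgrp": 31157,
    "babarAll": 7859,
    "AllUsers": 14523,
    "glastdata": 854,
    "glastusers": 23181,
    "glastgrp": 366,
    "geantgrp": 3874,
    "luxlz": 3500,
    "cdmsdata": 2000,
    "lcdprodgrp": 1100,
    "exoprodgrp": 1500,
    "hpsprodgrp": 1000,
    "rpgrp": 500,
    "lcd": 600,
    "exousergrp": 550,
    "rdgrp": 0,
    "theorygrp": 4257,
}


def share_percentages(shares):
    """Return each group's share as a percentage of the total, to 2 decimals."""
    total = sum(shares.values())
    return {group: round(100.0 * value / total, 2) for group, value in shares.items()}


def allusers_tax(stakeholder_total, tax_rate=0.15):
    """Shares granted to AllUsers for a given stakeholder total.

    Assumption: the tax is a fraction of the whole pool, so
    AllUsers = stakeholder_total * rate / (1 - rate).
    """
    return round(stakeholder_total * tax_rate / (1.0 - tax_rate))


if __name__ == "__main__":
    stakeholder_total = sum(v for g, v in SHARES.items() if g != "AllUsers")
    print("stakeholder shares:", stakeholder_total)
    print("AllUsers tax:", allusers_tax(stakeholder_total))
    for group, pct in share_percentages(SHARES).items():
        print(f"{group:<12} {SHARES[group]:>6} {pct:>6.2f}")
```

Under that assumption the stakeholder groups sum to 82298 shares, and the computed tax agrees with the AllUsers row of the table.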

Decommissioning clusters

When hardware is declared obsolete and put into run-to-fail mode, its fairshares should be subtracted from the owning stakeholder groups, and the associated AllUsers tax should be reduced accordingly. The fairshare distribution should always reflect stakeholder investment in current production cluster resources.
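
As an illustration of that bookkeeping, the sketch below subtracts retired shares from stakeholder groups and then recomputes the AllUsers tax. It is hypothetical throughout: the `decommission` helper, the group names, and the example numbers are invented, and the tax is again assumed to be 15% of the whole pool.

```python
def decommission(shares, retired, tax_rate=0.15, tax_group="AllUsers"):
    """Return a new share table after retiring hardware.

    shares  -- mapping of LSF group name to current shares
    retired -- mapping of stakeholder group name to shares being removed
    Assumption: the tax group holds tax_rate of the whole pool.
    """
    updated = dict(shares)
    for group, amount in retired.items():
        if group == tax_group:
            raise ValueError("the tax group is derived, not retired directly")
        if amount < 0 or amount > updated.get(group, 0):
            raise ValueError(f"cannot retire {amount} shares from {group}")
        updated[group] -= amount

    # Recompute the tax from what the stakeholders still own.
    stakeholder_total = sum(v for g, v in updated.items() if g != tax_group)
    updated[tax_group] = round(stakeholder_total * tax_rate / (1.0 - tax_rate))
    return updated


if __name__ == "__main__":
    # Hypothetical two-stakeholder farm, not the production values.
    before = {"groupA": 6000, "groupB": 2500, "AllUsers": 1500}
    after = decommission(before, {"groupA": 1700})
    print(after)
```

Deriving the tax from the remaining stakeholder total, rather than subtracting a fixed amount, keeps the AllUsers fraction constant however much hardware is retired.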

Chargeback model for sustainable hardware lifecycle

...