...

The data retention practices described on this page are provided on a best-effort basis: they depend on the availability of the actual resources and may change at any time if these resources become unavailable. Historically we have been able to deliver on these lifetimes (with the exception of scratch, where we twice had to delete data newer than 4 months), but we cannot guarantee them: available resources depend on the actual usage rate and available funding, which we do not fully control. Notifications will be sent out when files are removed (scratch) or purged (xtc, hdf5) from disk.

Policy by Folder

| Space        | Quota          | Backup       | Lifetime   | Comment                                 |
|--------------|----------------|--------------|------------|-----------------------------------------|
| xtc          | None           | Tape archive | 4 months   | Raw data                                |
| usrdaq       | None           | Tape archive | 4 months   | Raw data from users' DAQ systems        |
| hdf5         | None           | Tape archive | 4 months   | Data translated to HDF5                 |
| scratch      | None           | None         | 4 months   | Temporary data (lifetime not guaranteed)|
| results      | 4TB, 10K files | Tape backup  | 2 years    | Analysis results (star)                 |
| calib        | None           | Tape backup  | 2 years    | Calibration data                        |
| User home    | 20GB           | Disk + tape  | Indefinite | User code                               |
| Tape archive | -              | -            | 10 years   | Raw data (xtc, hdf5, usrdaq)            |
| Tape backup  | -              | -            | Indefinite | User home, results and calib folder     |
| Disk backup  | -              | -            | Indefinite | Accessible under ~/.zfs/                |
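
As a rough illustration of the results quota listed above (4TB, 10K files), the following sketch totals the size and number of files under a folder and compares them with those limits. It is not an official tool: the function names are hypothetical, and reading 1 TB as 10**12 bytes is an assumption, since this page does not say which convention the quota system uses.

```python
import os

# Limits copied from the results row of the table.
# Assumption: 1 TB is counted as 10**12 bytes; the real quota system may differ.
RESULTS_MAX_BYTES = 4 * 10**12
RESULTS_MAX_FILES = 10_000


def folder_usage(root):
    """Return (total_bytes, file_count) for all files below root.

    Symbolic links are counted as themselves and are not followed.
    """
    total_bytes = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            total_bytes += st.st_size
            file_count += 1
    return total_bytes, file_count


def within_quota(total_bytes, file_count,
                 max_bytes=RESULTS_MAX_BYTES, max_files=RESULTS_MAX_FILES):
    """True if both the byte total and the file count are within the limits."""
    return total_bytes <= max_bytes and file_count <= max_files
```

Calling `within_quota(*folder_usage(path))` on a results folder gives a quick local estimate; the numbers reported by the storage system itself remain authoritative.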

...

  • Please do not store data that you cannot recreate under the scratch folder: this directory is not backed up, and the oldest files on scratch may be deleted at any time to make space for data from new experiments.
  • For files in scratch/, results/ and calib/ the age is determined using the last modification time of a file (not the access time).
    For xtc and hdf5 files the access time is used (see xtc/hdf5 cleanup).

  • The tape archive (xtc, hdf5, usrdaq) and the tape backup (results, home) are fundamentally different:
    • In the tape archive the folders are frozen after the end of the experiments and their contents are stored on tape once. 
    • In the tape backup, the system takes snapshots of the folders as they appear at a given time. This implies that files which are deleted from disk are eventually, i.e. after a long enough time, also deleted from tape.
  • Files under xtc and hdf5 can be restored from tape using the file manager tab in the electronic logbook. Files under home can be restored by the user following the instructions below. To restore data from results, send an email to cds-datamgt-l@slac.stanford.edu.
  • For raw data the cleanup operations will affect all files, i.e. all streams and chunks, which make up one run, rather than individual files.
  • After 2 years from the end of an experiment we'll remove the experiment from disk. At that point we'll take a snapshot of the results and calib folders and archive them to tape so that we can, upon request, restore an entire experiment back to disk.
  • After 10 years we plan to remove the tapes with the archived raw data from the silos and store them in a safe environment.
  • The new policy will apply to all experiments, i.e. it will be retroactive, and its deployment date will coincide with the start of Run 14 (August 10th 2016).
  • For questions regarding data retention and data access, send an email to pcds-datamgt-l@slac.stanford.edu.
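
The age rule in the notes above (modification time for scratch, results and calib; access time for xtc and hdf5) can be sketched as follows. This is only an illustration, not the actual cleanup code: the function names are hypothetical, and approximating "4 months" as 120 days is an assumption.

```python
import os
import time

SECONDS_PER_DAY = 86400
# Assumption: "4 months" is approximated as 120 days for this sketch.
DEFAULT_LIFETIME_DAYS = 120


def file_age_days(path, use_access_time=False, now=None):
    """Age of a file in days, from its modification time by default.

    Pass use_access_time=True to use the last access time instead,
    as the notes describe for xtc and hdf5 files.
    """
    st = os.stat(path)
    reference = st.st_atime if use_access_time else st.st_mtime
    if now is None:
        now = time.time()
    return (now - reference) / SECONDS_PER_DAY


def is_past_lifetime(path, lifetime_days=DEFAULT_LIFETIME_DAYS,
                     use_access_time=False, now=None):
    """True if the file is older than the given lifetime."""
    return file_age_days(path, use_access_time, now) > lifetime_days
```

For a file in scratch, `is_past_lifetime(path)` applies the modification-time rule; for an xtc or hdf5 file, `is_past_lifetime(path, use_access_time=True)` applies the access-time rule. The result depends on how the filesystem records access times.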

User Home

  • Please do not store large files under your home; this space is meant for code, scripts, documents, etc., not science data.
  • Users can check the used and available space under their home with a command like:

...