...

  • The FFB currently offers the fastest file system (WekaIO on NVMe disks via IB HDR100) of all LCLS storage systems; however, its size is only about 400-500 TB.
  • The raw data will be kept on the FFB for a week after an experiment ends; however, for data-intensive experiments, files might be purged even before the experiment ends.
    Files deleted from the FFB will be available only on one of the offline systems (psana, SDF, or NERSC).
  • The raw data are copied to the offline storage system and to tape immediately, i.e. in quasi-real time during the experiment, not after they have been deleted from the FFB.
  • The user-generated data created in the scratch/ folder are moved to the offline storage when the experiment is deleted from the FFB.
  • When running on the FFB, the xtc/ and scratch/ folders (below /cds/data/drpsrcf/...) should be used for reading and writing; see the sketch after this list. The Lustre ana-filesystems must not be used (the only exception is calib/, see below).
  • The LCLS JupyterHub allows starting notebooks on the psffb nodes, which will have access to the data and the FFB scratch folder of an experiment.
  • For the time being, the new FFB system is available only for FEH experiments. NEH experiments will still rely on psana resources.
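For example, an analysis job running on the FFB nodes can point psana at the FFB copy of the xtc files and write its output below the FFB scratch/ folder. The sketch below is only illustrative: it assumes the LCLS-I style psana Python API with the dir= and live datasource options, and the experiment name, run number, and output file are placeholders.

    # Minimal sketch: read the FFB copy of the xtc files and write results to
    # the FFB scratch/ folder (never to the Lustre ana-filesystems).
    # Experiment name, run number and output file name are placeholders.
    import numpy as np
    import psana

    exp = 'mfx123456'
    ffb_xtc = f'/cds/data/drpsrcf/mfx/{exp}/xtc'
    ffb_scratch = f'/cds/data/drpsrcf/mfx/{exp}/scratch'

    # 'dir=' points psana at the FFB copy of the xtc files; 'live' keeps
    # reading while the DAQ is still writing the current run.
    ds = psana.DataSource(f'exp={exp}:run=1:smd:dir={ffb_xtc}:live')

    nevents = 0
    for evt in ds.events():
        nevents += 1

    # User output belongs below scratch/ on the FFB.
    np.save(f'{ffb_scratch}/run0001_nevents.npy', np.array([nevents]))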

You can access the FFB system from pslogin, psdev or psnx with:

...

You can submit your fast feedback analysis jobs to one of the queues shown in the following table; an example submission is given below the table. The goal is to assign dedicated resources to up to three experiments for each shift. Please contact your POC to get one of the high-priority queues, 1, 2, or 3, assigned to your experiment.

Queue   | Comments                          | Throughput [Gbit/s] | Nodes | Cores/Node | RAM [GB/node] | Default Time Limit
--------|-----------------------------------|---------------------|-------|------------|---------------|-------------------

anaq    | For the week after the experiment | 100                 | 62    | 61         | 128           | 12 hrs
ffbl1q  | Off-shift queue for experiment 1  | 100                 | 5     | 61         | 128           | 12 hrs
ffbl2q  | Off-shift queue for experiment 2  | 100                 | 14    | 61         | 128           | 12 hrs
ffbl3q  | Off-shift queue for experiment 3  | 100                 | 20    | 61         | 128           | 12 hrs
ffbh1q  | On-shift queue for experiment 1   | 100                 | 5     | 61         | 128           | 12 hrs
ffbh2q  | On-shift queue for experiment 2   | 100                 | 14    | 61         | 128           | 12 hrs
ffbh3q  | On-shift queue for experiment 3   | 100                 | 20    | 61         | 128           | 12 hrs
ffbgpuq | Nvidia RTX A5000                  | 100                 | 2     | 32         | 128           | 12 hrs

Note that jobs submitted to ffbl<n>q will preempt jobs submitted to anaq, and jobs submitted to ffbh<n>q will preempt jobs submitted to ffbl<n>q and anaq. Jobs that are preempted to make resources available to a higher-priority queue are paused and automatically resumed when resources become available.
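As an illustration, the sketch below submits an analysis script to one of these queues. It assumes the queues in the table are exposed as Slurm partitions and that sbatch is available on the node you submit from; the partition name, task count, job script, and log path are placeholders.

    # Illustrative only: submit a job script to one of the FFB queues,
    # assuming they are exposed as Slurm partitions (sbatch must be on PATH).
    import subprocess

    partition = 'ffbh1q'             # queue assigned to your experiment by the POC
    script = 'my_ffb_analysis.sh'    # placeholder job script
    logfile = '/cds/data/drpsrcf/mfx/mfx123456/scratch/slurm-%j.out'  # placeholder path

    cmd = [
        'sbatch',
        f'--partition={partition}',
        '--ntasks=16',               # number of MPI ranks for the analysis
        f'--output={logfile}',
        script,
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    print(result.stdout.strip())     # e.g. "Submitted batch job <jobid>"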

...

  1. The scratch/ folder is made inaccessible to users.
  2. Files and directories below the FFB scratch/ are moved to scratch/ffb/ on the offline filesystem:
        /cds/data/psdm/<instr>/<expt>/scratch/ffb/
    except for hdf5 files in the smalldata folder (see the next item).
    Once the data are on the offline scratch, the Data Retention Policy applies. The transfer preserves the files' mtime, which is used by the cleanup.
  3. hdf5 files below scratch/hdf5/smalldata/ are moved to the hdf5/smalldata/ folder on the offline filesystem (see the sketch after this list), e.g.
    /cds/data/drpsrcf/mfx/mfx123456/scratch/hdf5/smalldata/*.h5  ->  /cds/data/psdm/mfx/mfx123456/hdf5/smalldata/
    a. Only h5 files (and directories) below scratch/hdf5/smalldata/ are copied to the hdf5/smalldata/ folder on the offline storage.
    b. Files that do not match rule a. will be moved to the offline scratch/ffb/hdf5 folder.
    c. If an h5 file already exists on the offline storage and is newer than the one on the FFB, the FFB file will not be copied but simply removed.
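These rules can be pictured with a short sketch. This is not the actual transfer tool, only a restatement of items 2 and 3 in code; the instrument and experiment names are placeholders.

    # Sketch of items 2 and 3 above (not the actual transfer tool): decide
    # where a file below the FFB scratch/ ends up when the experiment is
    # removed from the FFB. Instrument/experiment names are placeholders.
    import os
    import shutil

    FFB_SCRATCH = '/cds/data/drpsrcf/mfx/mfx123456/scratch'
    OFFLINE_EXP = '/cds/data/psdm/mfx/mfx123456'
    SMALLDATA = 'hdf5/smalldata/'

    def move_scratch_file(relpath):
        """Move one file, given by its path relative to the FFB scratch/ folder."""
        src = os.path.join(FFB_SCRATCH, relpath)

        if relpath.startswith(SMALLDATA) and relpath.endswith('.h5'):
            # Rule a: h5 files below scratch/hdf5/smalldata/ go to hdf5/smalldata/.
            dst = os.path.join(OFFLINE_EXP, 'hdf5/smalldata', relpath[len(SMALLDATA):])
        else:
            # Item 2 / rule b: everything else goes below the offline scratch/ffb/.
            dst = os.path.join(OFFLINE_EXP, 'scratch/ffb', relpath)

        # Rule c: if a newer copy already exists offline, the FFB copy is dropped.
        if os.path.exists(dst) and os.path.getmtime(dst) > os.path.getmtime(src):
            os.remove(src)
            return dst

        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.move(src, dst)   # rename or copy2; either way the mtime is preserved
        return dst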

...

  • Each FFB file server (16 of them) has a 100Gb/s IB connection
  • Each batch node has a 100Gb/s IB connection
  • Batch nodes have either a 10Gb/s or a 1Gb/s Ethernet connection
  • All Ethernet Lustre access eventually goes through psnfslustre02 and is limited to 10 Gb/s
  • The figure also shows which network is used for the different file paths

...