This page describes the setup at S3DF that is used starting in summer 2023.

Details can be found on Running at S3DF.

Overview

Many MEC user groups rely on tiff files for each detector in each event/shot taken. The necessary jobs are set up in the elog/ARP and are typically triggered automatically. A standard experiment has the following jobs:

This setup is now based on smalldata_tools, where the producer accepts a few options:

--full: this option translates all the data to hdf5 files. This results in large hdf5 files and is generally not recommended, both for disk-space reasons and because it pushes the burden of data reduction to later steps of the analysis chain.

--image: this option stores the data of tiled detectors as a single 2-d image. This is generally not recommended, as it can introduce discontinuities in observed features when the detector pixels are not perfectly aligned in x & y. We usually recommend keeping the data in raw-data shape and using the x, y & z values for each pixel in further analysis.

--tiff: in addition to the hdf5 files, each dataset with a 2-d shape per event is also stored as a tiff file in the scratch directory.
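The pitfall of the --image option can be illustrated with a toy example (this is not actual smalldata_tools code; the tile sizes and offsets are made up). Assembling misaligned tiles onto a common grid leaves gap pixels between them, which is why raw-data shape plus per-pixel coordinates is preferred:

```python
import numpy as np

# Toy 2-tile detector: raw-shape data is (n_tiles, rows, cols),
# here 2 tiles of 4x4 pixels.
raw = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)

# Per-pixel x/y coordinates in pixel units; the second tile sits 2 pixels
# to the right of the first, as happens when tiles are not perfectly aligned.
x = np.stack([np.tile(np.arange(4), (4, 1)),
              np.tile(np.arange(4), (4, 1)) + 6])
y = np.stack([np.tile(np.arange(4)[:, None], (1, 4))] * 2)

# Assembling a single 2-d image (what --image does) forces the tiles onto
# a common grid and leaves empty pixels where the tiles do not meet.
img = np.full((4, 10), np.nan)
img[y.ravel(), x.ravel()] = raw.ravel()

print(int(np.isnan(img).sum()))  # prints 8: the gap pixels stay empty
```

Analysis done directly on the raw-shape array with the x, y (& z) coordinate arrays avoids these artificial gaps entirely.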

Reaching S3DF

To reach S3DF, access 's3dflogin' via ssh (or from a NoMachine/nx server). To be able to read data or run code, you then need to go to the 'psana' machine pool (ssh psana).

Copy data to your home institution (or laptop):

To copy data to your home institution, if you previously used psexport, you will need to switch to:

s3dfdtn.slac.stanford.edu

More information can be found on Downloading Data

Working directories

In S3DF, there is a single space for each experiment. The data can be read from two different locations: the ffb and the offline. Data first becomes available on the ffb and then moves to the offline. The limited size of the ffb means that data is only available there temporarily. As a rule of thumb, use the offline copy if it is present.

The directory structure is as follows:

/sdf/data/lcls/ds/mec/<experiment>/
drwxrws---+ 1 psdatmgr ps-data  0 Mar  3  2022 calib
drwxr-s---+ 1 psdatmgr ps-data  0 Feb 23 14:51 hdf5
drwxrws---+ 1 psdatmgr ps-data  0 Feb 23 15:24 results
lrwxr-x---  1 psdatmgr xs      42 Feb 16 08:07 scratch -> /sdf/scratch/lcls/ds/mec/<experiment>/scratch
drwxrws---+ 1 psdatmgr ps-data  0 Feb 24  2022 stats
drwxr-s---+ 1 psdatmgr ps-data  0 Feb 16 15:23 xtc

The ffb is a mount of the filesystem in the DRPSRCF already used in previous years:

/sdf/data/lcls/drpsrcf/ffb/mec/<experiment>/

Be aware that data is only kept there as long as there is space. It is unlikely to contain all of your data, and data may need to be cleaned off before the end of the ongoing shift.
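The rule of thumb above (prefer the offline copy, fall back to the ffb) can be sketched in a few lines of Python; the experiment name below is hypothetical, and the paths follow the layout shown above:

```python
import os

def xtc_dir(hutch, experiment):
    """Return the directory to read xtc files from: the offline copy if it
    exists, otherwise the ffb mount. (Sketch only; experiment name below
    is a made-up example.)"""
    offline = f"/sdf/data/lcls/ds/{hutch}/{experiment}/xtc"
    ffb = f"/sdf/data/lcls/drpsrcf/ffb/{hutch}/{experiment}/xtc"
    return offline if os.path.isdir(offline) else ffb

print(xtc_dir("mec", "meclx1234"))
```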

The results folder should host most of the users' code, notebooks, etc.

For more dedicated data processing (covered later), the smalldata_tools working directory generally is

/sdf/data/lcls/ds/<hutch>/<expname>/results/smalldata_tools 

Jupyterhub


The Jupyterhub setup at S3DF is described here

Detector data - image versus raw data

(An example picture showing the effect of going from 3-d to 2-d space will be added here.)

One time processing versus psana use (standard smalldata workflow)

A typical pattern in hutches that take data at 120 Hz is to process the data once and use the results for further analysis. Reducing the '3d' detector data into q-phi space is one such example, as is photon finding. Reading & calibrating the detector images can be CPU intensive, as can detailed photon finding. We typically recommend doing this step once, saving the results in an hdf5 file, and doing further analysis (e.g. normalizing with i0 detectors, ...) on these files.
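The second step of this process-once pattern can be sketched as follows, assuming the relevant datasets have already been read from the small-data hdf5 file into numpy arrays (the dataset contents below are randomly generated stand-ins; the actual keys and shapes depend on the experiment's producer configuration):

```python
import numpy as np

# Stand-ins for datasets read from the small-data hdf5 file.
rng = np.random.default_rng(0)
azav = rng.random((100, 50))        # per-shot azimuthally averaged signal
i0 = rng.random(100) + 0.5          # per-shot incoming-intensity (i0) monitor

good = i0 > 0.6                     # drop shots with too little beam
norm = azav[good] / i0[good, None]  # normalize each remaining shot by its i0

print(norm.shape[1])  # prints 50: the q-binning is unchanged
```

Because the CPU-heavy calibration/reduction already happened in the producer, this second step runs in seconds and can be iterated on freely in a notebook.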






