1. Generation of the small data hdf5 files

The small data generation takes advantage of the local development of the PSANA framework. It can be customized and run both against the xtc files while they are being written on the ffb system (ongoing experiments only) and against the .xtc files on the offline system. Parallel computing is built in. For a typical experiment, Silke will set up the code in the directory /reg/d/psdm/<hutch>/<expname>/results/smalldata_tools for offline running, or in /cds/data/drpsrcf/<hutch>/<experiment>/scratch/smalldata_tools when running on the new ffb.

A "driver" python file (typically called smd_producer.py, in the producers subdirectory) can then be edited to, e.g., choose and optimize a different ROI on area detectors, define the beam center for radial integration, or define the delay time range and bin sizes. How to set up this "userData" (reduced or feature-extracted data) is described in more detail in A. Configuring smalldata production.

The output will be saved to a directory that can be specified. By default this will be:

/reg/d/psdm/<hutch>/<expname>/hdf5/smalldata or /cds/data/drpsrcf/<hutch>/<experiment>/scratch/hdf5/smalldata

The filenames will be of the form: <expname>_Run<runnr>.h5
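The default locations and filename pattern above can be put together in a short helper. This is only a sketch: the function name `smalldata_path` and its parameters are hypothetical, and the zero-padding of the run number to four digits is an assumption that should be checked against the files actually present in the output directory.

```python
from pathlib import PurePosixPath


def smalldata_path(hutch: str, expname: str, run: int,
                   ffb: bool = False, pad: int = 4) -> PurePosixPath:
    """Build the default smalldata hdf5 path for one run.

    `pad` (zero-padding width of the run number) is an ASSUMPTION,
    not something stated in the text above; set pad=0 for no padding.
    """
    if ffb:
        # Default output directory when running on the new ffb
        base = PurePosixPath("/cds/data/drpsrcf") / hutch / expname / "scratch" / "hdf5" / "smalldata"
    else:
        # Default output directory on the offline system
        base = PurePosixPath("/reg/d/psdm") / hutch / expname / "hdf5" / "smalldata"
    # Filename pattern: <expname>_Run<runnr>.h5
    return base / f"{expname}_Run{run:0{pad}d}.h5"


if __name__ == "__main__":
    print(smalldata_path("xcs", "xcsx39718", 12))
    print(smalldata_path("xcs", "xcsx39718", 12, ffb=True))
```

The resulting path can then be handed to whichever hdf5 reader you prefer for analysis.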

While the processing is generally done from the elog (ARP), if you would like to run an interactive test job with only a few events, you can use:

./arp_scripts/submit_smd.sh -r <#> -e <experiment_name> --nevents <#> --interactive
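If you launch such test jobs repeatedly, the command line above can be assembled programmatically. The following is a minimal sketch that only builds and prints the command; the helper name `build_submit_command` is hypothetical, and it uses only the flags shown in this section (-r, -e, --nevents, --interactive).

```python
import shlex


def build_submit_command(experiment, run, nevents=None, interactive=False,
                         script="./arp_scripts/submit_smd.sh"):
    """Return the argument list for a submit_smd.sh call (hypothetical helper)."""
    cmd = [script, "-r", str(run), "-e", experiment]
    if nevents is not None:
        # Limit the number of events, e.g. for a quick test
        cmd += ["--nevents", str(nevents)]
    if interactive:
        # Run live, without the batch system
        cmd.append("--interactive")
    return cmd


if __name__ == "__main__":
    cmd = build_submit_command("xcsx39718", 12, nevents=100, interactive=True)
    # Print the shell-quoted command instead of executing it
    print(shlex.join(cmd))
```

To actually start the job, the list could be passed to `subprocess.run(cmd, cwd=...)`, with `cwd` pointing at your smalldata_tools directory.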

The full list of options is here:

(ana-4.0.17) snelson@drp-srcf-eb003:/cds/data/drpsrcf/xcs/xcsx39718/scratch/smalldata_tools$ ./arp_scripts/submit_smd.sh -h
submit_smd.sh: 
Script to launch a smalldata_tools run analysis

OPTIONS:
-h|--help
    Definition of options
-e|--experiment
    Experiment name (i.e. cxilr6716)
-r|--run
    Run Number
-d|--directory
    Full path to directory for output file
-n|--nevents
    Number of events to analyze
-q|--queue
    Queue to use on SLURM
-c|--cores
    Number of cores to be utilized
-f|--full
    If specified, translate everything
-D|--default
    If specified, translate only smalldata
-i|--image
    If specified, translate everything & save area detectors as images
-T|--tiff
    If specified, translate everything & save area detectors as images * single-event tiffs
--norecorder
    If specified, don't use recorder data
--nparallel
    Number of processes per node
--postTrigger
    Post that primary processing done to elog to seconndary jobs can start
--interactive
    Run the process live w/o batch system

We will usually set up the production to run automatically through the ARP. The number of jobs is tuned to use as few cores as necessary to process data at data-taking speed. This keeps the time before the files become available to a minimum while keeping the queue as empty as possible. Automatic production is useful when the reduction parameters are stable for sets of runs (most of the XPP and XCS experiments fall into this category).
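The idea of using "as few cores as necessary" to keep up with data taking can be illustrated with simple arithmetic. This is only a back-of-the-envelope sketch, not how the actual tuning is done: the function `cores_needed` is hypothetical, and the per-core processing rates below are made-up numbers that depend strongly on the detectors and the reduction being applied.

```python
import math


def cores_needed(event_rate_hz: float, per_core_rate_hz: float) -> int:
    """Smallest number of cores whose combined rate matches the event rate.

    Assumes ideal scaling (no I/O or communication overhead), which is
    an ASSUMPTION made for illustration only.
    """
    if event_rate_hz <= 0 or per_core_rate_hz <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(event_rate_hz / per_core_rate_hz)


if __name__ == "__main__":
    # Hypothetical example: 120 Hz data taking, 10 events/s per core
    print(cores_needed(120, 10))
    # Hypothetical example: a heavier reduction at 7 events/s per core
    print(cores_needed(120, 7))
```

In practice the measured throughput of a test job on a few events gives a more realistic per-core rate than any estimate.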