Performance
A document that shows SZ performance with LCLS crystallographic data (relative error bounds) is here. It does not have results for the absolute-error-bound case, which we think is more relevant for LCLS-II, but Chuck Yoon has studied that for crystallography data (see JIRA tickets).
An email exchange with Chuck and Stefano on Dec. 20, 2022
How fast is roibinsz running for a 2M detector image? Not including det.calib.
Table VI captures the compression rate on arXiv (https://arxiv.org/abs/2206.11297)
Lysozyme dataset was on Jungfrau4M => 89.89 GB/s on 400 cores.
# Convert to ms/image/core
Given 4M pixels x float32 (4 bytes) = 32 MB/image,
89.89GB/32MB => 2809 images/s on 400 cores => 7 images/s/core => 142ms/image/core
For a 2M detector, this would take half the time => 71ms/image/core
For epix10k2M (mfxlv4920), I was getting 181ms/image on a single core for det.calib using psana1.
Is 20ms/image det.calib time from psana2?
reply:
Hi Chuck,
Yes, for the default det.calib I get 170ms/image for an epix10ka 2Mpx on psffb. However, if I disable common-mode (which I hope we can do for drp) that time drops to 20ms/image with cmpars set to zero:
det.calib(evt, cmpars=[0, 0, 0, 0])
chris
Thanks Chuck,
This 71ms number for roibinsz + 20ms for calibration does feel tight on 10 nodes. Implies 10Hz per core, or 500 cores (8 nodes).
To do this feels like we would likely have to go to 20 daq nodes to have >2x overhead.
Ideally we would try it “now” in the daq using a mode of the kcu firmware that Matt has that can emulate a camera. We’ll have to use the nodes that are currently being used for ffb processing, I believe.
chris
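The arithmetic in this thread can be reproduced with a short script. This is a minimal sketch that takes every input (89.89 GB/s on 400 cores, 32 MB/image, 20 ms for det.calib) from the emails as given; the function names are illustrative, and decimal units (1 GB = 1000 MB) are an assumption. Note that 4M float32 pixels would nominally be 16 MB rather than 32 MB, which would halve the per-image times, so the image size is left as a parameter.

```python
# Inputs quoted in the email thread (not measured here).
THROUGHPUT_GB_PER_S = 89.89   # aggregate SZ rate on 400 cores (Jungfrau4M, lysozyme)
N_CORES = 400
IMAGE_MB = 32.0               # image size as stated in the email
CALIB_MS = 20.0               # det.calib with common mode disabled


def ms_per_image_per_core(throughput_gb_s, n_cores, image_mb):
    """Convert an aggregate compression rate into per-core time per image (ms)."""
    images_per_s = throughput_gb_s * 1000.0 / image_mb   # GB -> MB, decimal units
    images_per_s_per_core = images_per_s / n_cores
    return 1000.0 / images_per_s_per_core


def cores_needed(target_hz, ms_per_image):
    """Cores required to sustain target_hz when one core needs ms_per_image."""
    per_core_hz = 1000.0 / ms_per_image
    return target_hz / per_core_hz


sz_4m_ms = ms_per_image_per_core(THROUGHPUT_GB_PER_S, N_CORES, IMAGE_MB)
sz_2m_ms = sz_4m_ms / 2.0           # assumes time scales linearly with pixel count
total_2m_ms = sz_2m_ms + CALIB_MS   # compression plus calibration

print(f"4M: {sz_4m_ms:.1f} ms/image/core, 2M: {sz_2m_ms:.1f} ms/image/core")
print(f"2M with calibration: {total_2m_ms:.1f} ms/image/core")
```

The "10Hz per core, or 500 cores" estimate corresponds to an aggregate rate of 5000 images/s; that rate is inferred from the two numbers rather than stated in the thread. Under that assumption, `cores_needed(5000.0, 100.0)` reproduces the 500-core figure, and `cores_needed(5000.0, total_2m_ms)` gives the unrounded equivalent.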
Notes From Mtg With TXI Scientists
April 10, 2023 with Andy Aquila, Tim van Driel, Stefano Marchesini, Cong Wang, Chuck Yoon, Silke Nelson, Chris O'Grady
See EpixHR Emulator for results running SZ compression in the DAQ with epixHR emulator
Stefano showed results suggesting that 9x compression does not compromise physics results. Andy and Tim agreed, but Andy asked to see the differences with and without compression. They say this dataset is representative of higher-intensity SAXS/WAXS data, which is good news.
SAXS/WAXS datasets to look at:
- gas phase
- low intensity jet
- detector calibration with constant sample and varying intensities to study systematics
2x2 high-q binning is a possibility, but not at the center
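One way to picture 2x2 binning that spares the center is sketched below with NumPy. This is an illustration only, not the scheme discussed at the meeting: the function names, the square central patch, and the choice to keep the patch alongside a fully binned frame are all assumptions.

```python
import numpy as np


def bin2x2(image):
    """Sum each non-overlapping 2x2 block; both dimensions must be even."""
    rows, cols = image.shape
    return image.reshape(rows // 2, 2, cols // 2, 2).sum(axis=(1, 3))


def split_center(image, center, half_width):
    """Return (full-resolution central patch, 2x2-binned full frame).

    Assumes the patch lies entirely inside the image.
    """
    cy, cx = center
    patch = image[cy - half_width:cy + half_width,
                  cx - half_width:cx + half_width].copy()
    return patch, bin2x2(image)


frame = np.arange(16, dtype=float).reshape(4, 4)
patch, binned = split_center(frame, center=(2, 2), half_width=1)
```

Summing (rather than averaging) within each block preserves the total intensity, so the binned frame carries a quarter of the pixels with the same integrated signal.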
Next physics topics on the list:
- Fluctuation SAXS/WAXS: Mark Hunter has old, not-very-good data; a new experiment is coming up.
- SPI: binning in q space, because the detector is not planar (needs fancy binning)
In the future, photonizing data to look at:
- Yanwen (XCS)
- Minitti
High Intensity SAXS/WAXS Compression Results
Shown at the April 10, 2023 meeting.
example frames:
The images below show the time evolution of SAXS plots (radially averaged scattering intensity vs q vs time) from the xcsx1001121 experiment, with detector data compressed or uncompressed (and the difference), and a slice through .
The analysis requires accurate knowledge of the beam center and correct calibration of the time sequence, so if either is corrected later, the analysis must be repeated.
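The dependence on the beam center is visible in how a radial average is formed: every pixel is assigned to a ring by its distance from the center. The following is a minimal NumPy sketch of that step, not the analysis code used for these plots. It works in pixel radius rather than q (the conversion needs detector geometry and wavelength, which are not given here), and the function name and arguments are illustrative.

```python
import numpy as np


def radial_average(image, center, n_bins, mask=None):
    """Mean intensity in concentric rings of pixel radius around center.

    center is (row, col); mask is True for pixels to include.
    Returns (ring mid-radii, mean intensity per ring).
    """
    rows, cols = np.indices(image.shape)
    r = np.hypot(rows - center[0], cols - center[1])
    if mask is None:
        mask = np.ones(image.shape, dtype=bool)
    edges = np.linspace(0.0, r.max(), n_bins + 1)
    # Assign each pixel to a ring; clip so the outermost pixel lands in the last bin.
    idx = np.clip(np.digitize(r, edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx[mask], weights=image[mask], minlength=n_bins)
    counts = np.bincount(idx[mask], minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        profile = sums / counts   # NaN where a ring has no unmasked pixels
    return 0.5 * (edges[:-1] + edges[1:]), profile


flat = np.full((21, 21), 3.0)
radii, profile = radial_average(flat, center=(10, 10), n_bins=5)
```

Shifting `center` changes which ring each pixel falls into, which is why a later correction to the beam center means recomputing the profiles.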
A compression factor of about 9 (maximum absolute error = 100, SZ3 compression) does not appear to modify the final result (below).
Increasing the compression factor to 17 (maximum absolute error = 200, SZ3 compression) does appear to degrade the final result (figures not included).
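To make the "maximum absolute error" setting concrete, the sketch below applies a lossy round trip whose pointwise error is bounded by a chosen value and then measures the difference against the original. It uses plain uniform quantization as a stand-in for the error-bound guarantee; it is not the SZ3 algorithm, it does not reproduce the compression factors quoted above, and the synthetic frame is made up for illustration.

```python
import numpy as np


def quantize_abs_bound(data, abs_err):
    """Lossy round trip whose pointwise error does not exceed abs_err.

    Uniform quantization with step 2*abs_err; a stand-in for an
    absolute-error-bounded compressor, not SZ3 itself.
    """
    step = 2.0 * abs_err
    codes = np.round(data / step).astype(np.int64)   # integers a coder would store
    return codes * step


def max_abs_difference(original, reconstructed):
    """Largest pointwise deviation between the two arrays."""
    return float(np.max(np.abs(original - reconstructed)))


rng = np.random.default_rng(0)
frame = rng.uniform(0.0, 5000.0, size=(64, 64))   # synthetic data, not detector output

rec_100 = quantize_abs_bound(frame, 100.0)
rec_200 = quantize_abs_bound(frame, 200.0)
print(max_abs_difference(frame, rec_100), max_abs_difference(frame, rec_200))
```

A looser bound leaves fewer distinct values to encode, which is what allows a higher compression factor at the cost of larger deviations from the original data.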
analysis:
Low Intensity SAXS/WAXS Compression Results
Wednesday, January 17, 2024 at 13:30 - 14:00: Meeting with Tim van Driel, Chris O'Grady, Cong Wang
This work was done with the psana1 conda environment ana-4.0.50-py3.
XPP example low-flux experiment = 'xpplz0620', runs=[249,250]:
SZ3 compression with an absolute error bound of 5, compression ratio of 10
Given the results below, Tim van Driel is comfortable with deploying the SZ3 compression scheme on low-flux and high-flux data:
Also discussed the possibility of binning, possibly with no binning near the center, and/or small arcs instead of 2x2 binning. However, there is a concern with "bad pixels", such as shadows that may change from experiment to experiment. Other compressors that are 2x-4x faster (SZx and SZp) do not achieve adequate compression. Next tests will be with GPUs.
Code for analysis with compression is in a fork of small data tools and a jupyter notebook, plus the "reduce scattering code" (in https://github.com/Solution-Phase-Chemistry/ReduceScatt).