The first example script lives in /reg/g/psdm/tutorials/examplePython/userSmallHDF5_1.py and uses "h5py" (documented at http://www.h5py.org).
The first pattern would be used when you want to save all your small data at the end of the run (i.e. you know how many numbers you are going to save):

    import numpy as np
    import psana
    import h5py

    NUMEVENTS = 3

    ds = psana.DataSource('exp=xpptut15:run=54:smd')
    cspad = psana.Detector('cspad', ds.env())

    # Accumulate one number per event in memory
    cspad_sums = []
    for idx, evt in enumerate(ds.events()):
        if idx >= NUMEVENTS:
            break
        calib = cspad.calib(evt)
        if calib is None:
            continue  # no CsPad data in this event
        cspad_sums.append(np.sum(calib))

    # Write everything in one go once the event loop is finished
    h5out = h5py.File("userSmallData.h5", 'w')
    h5out['cspad_sums'] = cspad_sums
    h5out.close()

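To check what the first pattern puts on disk, the file can be read back with h5py. The following is a minimal sketch that does not need psana: the values in `cspad_sums` are made-up stand-ins for the per-event sums, and the scratch directory is an assumption chosen so the example does not overwrite a real output file.

```python
import os
import tempfile

import h5py
import numpy as np

# Stand-in values (assumption); in the script above these are the CsPad sums.
cspad_sums = [1.5, 2.5, 4.0]

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "userSmallData.h5")

# Assigning a Python list creates a fixed-size 1-D dataset from it.
with h5py.File(path, "w") as h5out:
    h5out["cspad_sums"] = cspad_sums

# Reopen read-only and load the whole dataset into a numpy array.
with h5py.File(path, "r") as h5in:
    readback = h5in["cspad_sums"][:]

print(readback.shape, readback.dtype)
```

Opening the file in a `with` block closes it automatically, which is equivalent to the explicit `h5out.close()` call in the script.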
The second pattern does not assume that the final size of the dataset is known: it uses HDF5 chunked storage and the resize function to grow a dataset as events are processed. The same pattern can also be used to write in-memory data at the end of a run, although that case is simpler because the size of the dataset is known by then. One might use this pattern if all the data for a run cannot be held in memory. This script lives in /reg/g/psdm/tutorials/examplePython/userSmallHDF5_2.py:

    import numpy as np
    import psana
    import h5py

    NUM_EVENTS_TO_WRITE = 3

    ds = psana.DataSource('exp=xpptut15:run=54:smd')
    cspad = psana.Detector('cspad', ds.env())

    h5out = h5py.File("userSmallData.h5", 'w')
    # Start empty; chunked storage and maxshape=(None,) make the dataset resizable
    saved = h5out.create_dataset('saved', (0,), dtype='f8',
                                 chunks=True, maxshape=(None,))

    nsaved = 0
    for idx, evt in enumerate(ds.events()):
        if idx >= NUM_EVENTS_TO_WRITE:
            break
        calib = cspad.calib(evt)
        if calib is None:
            continue  # no CsPad data in this event
        # Grow by one element and fill the new slot. Counting saved values
        # separately from idx avoids gaps when an event is skipped.
        saved.resize((nsaved + 1,))
        saved[nsaved] = np.sum(calib)
        nsaved += 1

    h5out.close()

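The grow-as-you-go behaviour can be tried without psana. The sketch below is an illustration under assumptions: `fake_events` is made-up data standing in for `cspad.calib(evt)`, with `None` mimicking an event that has no CsPad data, and the file is written to a hypothetical scratch directory. The write position is taken from the dataset's current length, so a skipped event does not leave an unfilled element behind.

```python
import os
import tempfile

import h5py
import numpy as np

# Made-up stand-ins for calibrated detector arrays; None mimics a missing event.
fake_events = [np.ones((2, 2)), None, np.full((2, 2), 2.0), np.full((2, 2), 3.0)]

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "grow_demo.h5")

with h5py.File(path, "w") as h5out:
    # Zero-length dataset; maxshape=(None,) allows unlimited growth along axis 0.
    saved = h5out.create_dataset("saved", (0,), dtype="f8",
                                 chunks=True, maxshape=(None,))
    for calib in fake_events:
        if calib is None:
            continue
        n = saved.shape[0]        # current length is the next free index
        saved.resize((n + 1,))    # grow by one element
        saved[n] = np.sum(calib)  # fill the new slot

with h5py.File(path, "r") as h5in:
    result = h5in["saved"][:]
    chunk_shape = h5in["saved"].chunks  # chunk shape picked by h5py

print(result, chunk_shape)
```

Resizing one element at a time keeps the example short. For long runs it may be worth growing the dataset in larger blocks and trimming it with a final `resize` before closing the file.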
...
A more advanced tutorial on saving data to an hdf5 file can be found on the page: More Advanced Tutorial on Saving Output in Hdf5