The example below specifies a chunk size of 2048 elements for the small dataset, and 12 elements for the large one. Each large element is a 32x185x388 float32 array, about 9MB, so each chunk of the large dataset is about 100MB. If you run this example over a large number of events, you will notice that it takes slightly longer to process every 12th event; this is when a chunk of the large dataset gets filled and flushed to disk. If you run the example as it is, over just 3 events, you will notice that the output file is still quite large, about 100MB, because HDF5 does not write partial chunks, only complete chunks. This script lives in /reg/g/psdm/tutorials/examplePython/userSmallHDF5_2userLargeHDF5.py:
Code Block

import numpy as np
import psana
import h5py

NUM_EVENTS_TO_WRITE = 3

ds = psana.DataSource('exp=xpptut15:run=54:smd')
h5out = h5py.File("userData.h5", 'w')

# small dataset: one 8-byte float per event, chunked in blocks of 2048 elements
smallDataSet = h5out.create_dataset('cspad_sums', (0,), dtype='f8',
                                    chunks=(2048,), maxshape=(None,))
# large dataset: one calibrated cspad image per event (~9MB each),
# chunked in blocks of 12 events (~100MB per chunk)
largeDataSet = h5out.create_dataset('cspads', (0,32,185,388), dtype='f4',
                                    chunks=(12,32,185,388), maxshape=(None,32,185,388))

cspad = psana.Detector('cspad', ds.env())

for idx, evt in enumerate(ds.events()):
    if idx >= NUM_EVENTS_TO_WRITE:   # stop once NUM_EVENTS_TO_WRITE events have been written
        break
    calib = cspad.calib(evt)
    if calib is None:
        continue
    # grow both datasets by one event, then store this event's data
    smallDataSet.resize((idx+1,))
    largeDataSet.resize((idx+1,32,185,388))
    smallDataSet[idx] = np.sum(calib)
    largeDataSet[idx,:] = calib[:]

h5out.close()
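After the script runs, you can check the chunk layout and the size arithmetic by reopening the output file with h5py. The following is a minimal sketch, assuming the userData.h5 file written by the script above is in the current directory:

Code Block

import numpy as np
import h5py

# Reopen the file written above and report its chunk layout.
# The chunk shape of the 'cspads' dataset (12 images of ~9MB each)
# explains why the file occupies roughly one full ~100MB chunk
# even though only 3 events were written.
with h5py.File("userData.h5", "r") as f:
    small = f['cspad_sums']
    large = f['cspads']
    print("cspad_sums: shape=%s chunks=%s" % (small.shape, small.chunks))
    print("cspads:     shape=%s chunks=%s" % (large.shape, large.chunks))
    image_bytes = np.prod(large.shape[1:]) * large.dtype.itemsize
    chunk_bytes = large.chunks[0] * image_bytes
    print("one cspad image: %.1f MB" % (image_bytes / 1e6))
    print("one 'cspads' chunk: %.1f MB" % (chunk_bytes / 1e6))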