Saving User "Small Data" in Hdf5

This script lives in /reg/g/psdm/tutorials/examplePython/userSmallHDF5.py.

We do not assume that the final size of the dataset is known in advance. The example uses HDF5 chunked storage and the resize function to grow a dataset one event at a time. The same pattern can be used to write in-memory data at the end of a run, but that case is easier because the size of the dataset is known by then.

import numpy as np
import psana
import h5py

NUM_EVENTS_TO_WRITE=3

ds = psana.DataSource('exp=xpptut15:run=54:smd')

h5out = h5py.File("userSmallData.h5", 'w')
saved = h5out.create_dataset('saved',(0,), dtype='f8', chunks=True, maxshape=(None,))

cspad = psana.Detector('cspad', ds.env())

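nsaved = 0  # number of values written so far
for idx, evt in enumerate(ds.events()):
    # idx starts at 0, so this loop looks at NUM_EVENTS_TO_WRITE+1 events
    if idx > NUM_EVENTS_TO_WRITE: break
    calib = cspad.calib(evt)
    # skip events with no cspad data; nsaved is not advanced, so no gap is left
    if calib is None: continue
    # grow the dataset by one element, then fill the new slot
    saved.resize((nsaved+1,))
    saved[nsaved] = np.sum(calib)
    nsaved += 1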

h5out.close()
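
For the end-of-run case mentioned above, the following is a minimal sketch rather than part of the tutorial script. It assumes the per-event values have already been accumulated in a Python list; the numbers and the file name userSmallDataEndOfRun.h5 are made up for illustration, and psana is not used so that the snippet runs anywhere h5py is installed.

```python
import numpy as np
import h5py

# Stand-in for per-event values accumulated in memory during a run
# (made-up numbers; in the script above these would be np.sum(calib)).
values = [1.5, 2.5, 4.0]

# The final size is known here, so the dataset is created in one call:
# no chunks/maxshape arguments and no resize are needed.
with h5py.File("userSmallDataEndOfRun.h5", "w") as h5out:
    h5out.create_dataset("saved", data=np.asarray(values, dtype="f8"))

# Read the file back to check what was written.
with h5py.File("userSmallDataEndOfRun.h5", "r") as h5in:
    readback = h5in["saved"][:]
```

Because the dataset is not created with maxshape here, it has a fixed size and cannot be resized later; use the resizable form from the script when more values may be appended.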

Good tools for inspecting an h5 file are h5ls and h5dump. For example, running:

 h5ls -d -r userSmallData.h5

shows the dataset and its values:

[cpo@psana1511]$ h5ls -d -r userSmallData.h5 
/                        Group
/saved                   Dataset {4/Inf}
    Data:
        (0) 23773.12109375, 135712.25, 65513.67578125, 16749.18359375
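
The same check can be done from Python with h5py. The helper below is a small sketch; the function name read_saved is hypothetical and not part of psana or h5py, and it assumes the dataset is named 'saved' as in the script above.

```python
import h5py

def read_saved(filename, name="saved"):
    """Return the full contents of dataset `name` as a NumPy array."""
    # Open read-only; the [:] slice copies the data into memory
    # before the file is closed.
    with h5py.File(filename, "r") as h5in:
        return h5in[name][:]
```

Calling read_saved("userSmallData.h5") on the file written by the script should return a one-dimensional float64 array holding the values that h5ls lists.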

A more advanced tutorial on saving data to an hdf5 file can be found on the page: More Advanced Tutorial on Saving Output in Hdf5
