Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

This mechanism by defaults produces "aligned" datasets where missing values are padded (with NaN's for floats, and -99999 for integers).  To create an unaligned dataset (without padding) prefix the name of the variable with "unaligned_".

NOTE: in addition to the hdf5 you specify as your output file ("my.h5" below) you will see other h5 files like "my_part0.h5", one for each of the cores specified in PS_SRV_NODES.  The reason for this is that each of those cores writes its own my_partN.h5 file: for LCLS2 it will be important for performance to write many files.  The "my.h5" file is actually quite small, and uses a new HDF5 feature called a "Virtual DataSet" (VDS) to join together the various my_partN.h5 files.  Also note that events in my.h5 will not be in time order.

Code Block
languagepy
from psana import DataSource
import numpy as np

# called back on each SRV node, for every smd.event() call below
def test_callback(data_dict):
    print(data_dict)
ds = DataSource(exp='tmoc00118', run=123)
# batch_size here specifies how often the dictionary of information
# is sent to the SRV nodes
smd = ds.smalldata(filename='my.h5', batch_size=5, callbacks=[test_callback])
run = next(ds.runs())

# necessary (instead of "None") since some ranks may not receive events
# and the smd.sum() below could fail
arrsum = np.zeros((2), dtype=np.int)
for i,evt in enumerate(run.events()):
    myones = np.ones_like(arrsum)
    smd.event(evt, myfloat=2.0, arrint=myones)
    arrsum += myones

if smd.summary:
    smd.sum(arrsum)
    smd.save_summary({'summary_array' : arrsum}, summary_int=1)
smd.done()

...