...
By default, this mechanism produces "aligned" datasets, in which missing values are padded (with NaN for floats and -99999 for integers). To create an unaligned dataset (without padding), prefix the name of the variable with "unaligned_".
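The padding rule can be illustrated with a small, self-contained sketch. This is a conceptual illustration only, not psana's implementation: the helper `align`, the event indices, and the variable names are hypothetical. In a real script, an unaligned variable would presumably be requested with a keyword such as `unaligned_myint=...` in the `smd.event()` call, following the prefix rule above.

```python
import math

FLOAT_FILL = math.nan   # padding used for float variables, per the text above
INT_FILL = -99999       # padding used for integer variables, per the text above

def align(values_by_event, n_events, fill):
    """Return one value per event, padding events where nothing was saved.

    Hypothetical helper: values_by_event maps event index -> saved value.
    """
    return [values_by_event.get(i, fill) for i in range(n_events)]

# Hypothetical run of 3 events: 'myfloat' saved on events 0 and 2,
# 'myint' saved on event 1 only.
myfloat = align({0: 2.0, 2: 2.5}, n_events=3, fill=FLOAT_FILL)
myint = align({1: 7}, n_events=3, fill=INT_FILL)

# An "unaligned_" variable is not padded, so it holds only the values that
# were actually saved and its length can differ from the number of events.
unaligned_myint = [7]
```

When reading aligned data back, float padding can be detected with `math.isnan` (or `numpy.isnan`), and integer padding by comparing against -99999.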
NOTE: In addition to the HDF5 file you specify as your output file ("my.h5" below), you will see other HDF5 files such as "my_part0.h5", one for each of the cores specified in PS_SRV_NODES. Each of those cores writes its own my_partN.h5 file; for LCLS2, writing many files will be important for performance. The "my.h5" file itself is quite small and uses a new HDF5 feature called a "Virtual Dataset" (VDS) to join together the various my_partN.h5 files. Also note that events in my.h5 will not be in time order.
```python
from psana import DataSource
import numpy as np

# called back on each SRV node, for every smd.event() call below
def test_callback(data_dict):
    print(data_dict)

ds = DataSource(exp='tmoc00118', run=123)
# batch_size here specifies how often the dictionary of information
# is sent to the SRV nodes
smd = ds.smalldata(filename='my.h5', batch_size=5, callbacks=[test_callback])
run = next(ds.runs())

# necessary (instead of "None") since some ranks may not receive events
# and the smd.sum() below could fail
# (plain int is used because the np.int alias was removed in NumPy 1.24)
arrsum = np.zeros((2), dtype=int)
for i, evt in enumerate(run.events()):
    myones = np.ones_like(arrsum)
    smd.event(evt, myfloat=2.0, arrint=myones)
    arrsum += myones

if smd.summary:
    smd.sum(arrsum)
    smd.save_summary({'summary_array': arrsum}, summary_int=1)
smd.done()
```
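Because events in my.h5 are not in time order, analysis code that needs time order has to re-sort after reading. The sketch below shows one way to do that in plain Python. It assumes the output file contains a per-event timestamp dataset aligned with the other per-event datasets; that assumption, the helper names `time_order` and `reorder`, and the sample values are all hypothetical. It applies only to aligned datasets, since unaligned ones do not have one entry per event.

```python
def time_order(timestamps):
    """Return the indices that visit events in increasing timestamp order."""
    return sorted(range(len(timestamps)), key=lambda i: timestamps[i])

def reorder(values, order):
    """Apply an index ordering to one aligned per-event dataset."""
    return [values[i] for i in order]

# Hypothetical values, standing in for datasets read back from my.h5.
timestamps = [30, 10, 20]
myfloat = [2.3, 2.1, 2.2]

# Compute the ordering once, then apply it to every aligned dataset.
order = time_order(timestamps)
sorted_timestamps = reorder(timestamps, order)
sorted_myfloat = reorder(myfloat, order)
```

With NumPy arrays, `np.argsort(timestamps)` gives the same ordering and can be used directly as an index.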
...