...
Below we assume that everything is set up to work on the LCLS analysis farm; otherwise see Computing (including Analysis) and Account Setup.
Libraries
Here is a list of Python libraries which we use in examples below:
...
where item
stands for a file, group, or dataset.
Check if the HDF5 item is "File", "Group", or "Dataset"
Code Block |
---|
isFile = isinstance(item, h5py.File)
isGroup = isinstance(item, h5py.Group)
isDataset = isinstance(item, h5py.Dataset)
|
In this example the standard Python
method isinstance(...)
returns True
or False
in each case, respectively.
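One subtlety is worth keeping in mind: in h5py, `File` is a subclass of `Group`, so `isinstance(item, h5py.Group)` is also True for a file object. The sketch below tests the most specific type first. The helper names `item_kind` and `demo_item_kinds` are hypothetical, the code is written for Python 3, and it uses a small in-memory file (`driver='core'`, `backing_store=False`) instead of a real data file.

```python
import h5py

def item_kind(item):
    """Return 'File', 'Group', or 'Dataset' for an h5py item (hypothetical helper)."""
    # h5py.File derives from h5py.Group, so File must be tested first.
    if isinstance(item, h5py.File):
        return 'File'
    if isinstance(item, h5py.Group):
        return 'Group'
    if isinstance(item, h5py.Dataset):
        return 'Dataset'
    return 'Unknown'

def demo_item_kinds():
    """Build a tiny in-memory HDF5 file and classify its items."""
    with h5py.File('demo.h5', 'w', driver='core', backing_store=False) as f:
        grp = f.create_group('Configure:0000')
        dset = grp.create_dataset('image', data=[1, 2, 3])
        # The last element shows that a File also passes the Group test.
        return item_kind(f), item_kind(grp), item_kind(dset), isinstance(f, h5py.Group)
```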
Get information about HDF5 item
- For all HDF5 items, these parameters are available:
Code Block |
---|
item.id      # for example: <GroupID [1] (U) 33554473>
item.ref     # for example: <HDF5 object reference>
item.parent  # for example: <HDF5 group "/Configure:0000/Run:0000/CalibCycle:0000" (5 members)>
item.file    # for example: <HDF5 file "cxi80410-r0587.h5" (mode r, 3.5G)>
item.name    # for example: /Configure:0000/Run:0000/CalibCycle:0000/Camera::FrameV1
|
...
- Get the list of daughters in the group:
Code Block |
---|
list_of_item_names = group.items() # list of (name, item) pairs
print list_of_item_names
|
or convert the group to a dictionary and iterate over its keys and values:
Code Block |
---|
for key,val in dict(group).iteritems():
    print key, val
|
Extract time
The time variable is stored in HDF5 as a tuple of two long integers: the seconds since 01/01/1970 and the nanoseconds as a fraction of the second. Time is usually stored in the group attributes and/or in a data record named "time", which can be extracted as shown below.
...
- from the time data record:
Code Block |
---|
time_dataset = file['/Configure:0000/Run:0000/CalibCycle:0002/Acqiris::DataDescV1/XppLas.0:Acqiris.0/time']
index = 0 # this is an index in the dataset
time_arr = time_dataset[index] # get the time tuple consisting of seconds and nanoseconds
time_sec = time_arr[0]
time_nsec = time_arr[1]
|
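Once the seconds and nanoseconds are extracted, they can be turned into a readable string with the standard library alone. This is a minimal sketch written for Python 3; the helper name `format_time` is hypothetical and the input values are illustrative, not read from a real file. It formats in UTC so that the result does not depend on the local time zone; use `time.localtime` instead for local time.

```python
import time

def format_time(t_sec, t_nsec):
    """Format a (seconds, nanoseconds) timestamp as a UTC string (hypothetical helper)."""
    # int() also accepts numpy integer scalars read from a dataset
    tutc = time.gmtime(int(t_sec))  # seconds since 1970-01-01 -> struct_time in UTC
    return '%s.%09d' % (time.strftime('%Y-%m-%d %H:%M:%S', tutc), int(t_nsec))

# Illustrative values only: one day after the epoch plus 5 ns
stamp = format_time(86400, 5)
print(stamp)
```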
Operations with CSPad pedestals
The most generic way to subtract the CSPad pedestals is to use the Translator, as described in CsPad calibration in translator. If calibration is requested in the Translator, the output HDF5 file contains the CSPad image data with pedestals already subtracted. Otherwise, the Translator saves raw CSPad data in the HDF5 file. If job execution time is not an issue, the pedestals can be subtracted from the raw data directly in code, as explained in this section.
How to find the files with CSPad pedestals
CSPad pedestals are usually calibrated using "dark" runs. If they were calibrated, the file for the appropriate run range, <run-range>.dat,
can be found in the directory
/reg/d/psdm/<INSTRUMENT>/<experiment>/calib/<calib-version>/<source>/pedestals/
If the pedestal file was available at translation time, the dataset
/Configure:0000/CsPad::CalibV1/XppGon.0:Cspad.0/pedestals
is saved in the HDF5 file and can be accessed directly.
One may prefer to calibrate and keep pedestal files in the local directory, as explained below.
How to calibrate CSPad pedestals
If the CSPad pedestals were not calibrated, they can be calibrated as explained in
the description of the CsPadPedestals psana - Original Documentation module. Essentially, one needs to run psana
with the cspad_mod.CsPadPedestals
module, using the command
psana -m cspad_mod.CsPadPedestals input-files.xtc
which by default produces two files:
cspad-pedestals.dat
– for average values, and
cspad-noise.dat
– for standard deviation values.
These files can be loaded in code as explained below.
Get CSPad pedestal array
The file with pedestal values can be read in code as a numpy array:
Code Block |
---|
import numpy as np
ped_fname = '/reg/d/psdm/<INS>/<experiment>/calib/<calib-version>/<source>/pedestals/<run-range>.dat'
ped_arr = np.loadtxt(ped_fname, dtype=np.float32)
ped_arr.shape = (32, 185, 388) # raw shape is (5920, 388)
|
In this example the pedestal file is loaded from the standard calib
directory. For your own pedestal file the path name should be changed.
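The reshape works because 32 × 185 = 5920: in NumPy's default row-major order, sensor `i` of the 3-D array corresponds to rows `185*i` through `185*(i+1) - 1` of the 2-D array read from the file. The sketch below checks this with a synthetic array standing in for the output of `np.loadtxt`; the values are not real pedestals and the constant names are chosen for illustration only.

```python
import numpy as np

N_SENSORS, N_ROWS, N_COLS = 32, 185, 388

# Synthetic stand-in for the array returned by np.loadtxt (not real pedestals)
flat = np.arange(N_SENSORS * N_ROWS * N_COLS, dtype=np.float32)
flat = flat.reshape(N_SENSORS * N_ROWS, N_COLS)    # (5920, 388), as read from the file

ped_arr = flat.reshape(N_SENSORS, N_ROWS, N_COLS)  # (32, 185, 388)

# Sensor i occupies rows i*185 ... (i+1)*185 - 1 of the 2-D array
i = 3
same = np.array_equal(ped_arr[i], flat[i * N_ROWS:(i + 1) * N_ROWS])
```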
Subtract CSPad pedestals
Assuming that the CSPad event array ds1ev
and the pedestal array ped_arr
are available,
the pedestals can be subtracted with a single numpy array operation:
Code Block |
---|
if ds1ev.shape == ped_arr.shape : ds1ev -= ped_arr
|
Note |
---|
This operation will only be valid if the CSPad data array is completely filled (all sensors are available) and its shape is equal to (32, 185, 388). Otherwise, the pedestal subtraction can be done in a loop over available sensors, taking into account the CSPad configuration. |
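The loop over available sensors could look like the sketch below. It rests on several assumptions: the helper `subtract_pedestals` and its `sensor_ids` argument are hypothetical, the list of available sensor indices is supplied by the caller (this sketch does not read the CSPad configuration), and the raw data is assumed to be an integer array. The data is converted to `float32` first because an in-place subtraction of float pedestals from an integer array either fails or truncates, depending on the NumPy version.

```python
import numpy as np

def subtract_pedestals(data, ped_arr, sensor_ids=None):
    """Return pedestal-subtracted data as float32 (hypothetical helper).

    data       -- raw array of shape (n_available, 185, 388)
    ped_arr    -- pedestal array of shape (32, 185, 388)
    sensor_ids -- indices (0..31) of the sensors present in data, in order;
                  None means all 32 sensors are present
    """
    result = data.astype(np.float32)  # copy; avoids integer/float in-place casting
    if sensor_ids is None:
        if result.shape != ped_arr.shape:
            raise ValueError('shape mismatch: %s vs %s' % (result.shape, ped_arr.shape))
        result -= ped_arr
    else:
        if len(sensor_ids) != result.shape[0]:
            raise ValueError('sensor_ids does not match the number of sensors in data')
        for i, sensor in enumerate(sensor_ids):
            result[i] -= ped_arr[sensor]  # subtract the pedestals of that sensor only
    return result
```

For a completely filled array the call is `subtract_pedestals(ds1ev, ped_arr)`; for a partial one, pass the indices of the sensors that are present, for example `subtract_pedestals(ds1ev, ped_arr, [1, 5])`.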
Code examples
Example 1: Basic operations
Code Block |
---|
#!/usr/bin/env python
import h5py
import numpy as np
eventNumber = 5
file = h5py.File('/reg/d/psdm/XPP/xppcom10/hdf5/xppcom10-r0546.h5', 'r')
dataset = file['/Configure:0000/Run:0000/CalibCycle:0000/Camera::FrameV1/XppSb4Pim.1:Tm6740.1/image']
arr1ev = dataset[eventNumber]
file.close()
print 'arr1ev.shape =', arr1ev.shape
print 'arr1ev =\n', arr1ev
|
...
Code Block |
---|
#!/usr/bin/env python
import h5py
import time

#-----------------------------------------------------
def print_time(t_sec, t_nsec):
    """Converts seconds in human-readable time and prints formatted time"""
    tloc = time.localtime(t_sec) # converts sec to the tuple struct_time in local
    print 'Input time :',t_sec,'sec,', t_nsec,'nsec, '
    print 'Local time :', time.strftime('%Y-%m-%d %H:%M:%S',tloc)
#-----------------------------------------------------

file_name = '/reg/d/psdm/xpp/xpp22510/hdf5/xpp22510-r0100.h5'
file = h5py.File(file_name, 'r') # open read-only

print "EXAMPLE: Get time from the group attributes:"
group = file["/Configure:0000"]
t_sec = group.attrs.values()[0]
t_nsec = group.attrs.values()[1]
print_time(t_sec, t_nsec)

print "EXAMPLE: Get time from the data record 'time':"
dataset = file['/Configure:0000/Run:0000/CalibCycle:0002/Acqiris::DataDescV1/XppLas.0:Acqiris.0/time']
index = 0 # index of the record in the dataset
time_arr = dataset[index]
t_sec = time_arr[0]
t_nsec = time_arr[1]
print_time(t_sec, t_nsec)

file.close()
#----------------------------------------------------
|
Example 3: Print entire file/group structure using recursive method
Code Block |
---|
#!/usr/bin/env python
import h5py
import sys

def print_hdf5_file_structure(file_name) :
    """Prints the HDF5 file structure"""
    file = h5py.File(file_name, 'r') # open read-only
    item = file #["/Configure:0000/Run:0000"]
    print_hdf5_item_structure(item)
    file.close()

def print_hdf5_item_structure(g, offset='    ') :
    """Prints the input file/group/dataset (g) name and begin iterations on its content"""
    if   isinstance(g,h5py.File) :
        print g.file, '(File)', g.name

    elif isinstance(g,h5py.Dataset) :
        print '(Dataset)', g.name, '    len =', g.shape #, g.dtype

    elif isinstance(g,h5py.Group) :
        print '(Group)', g.name

    else :
        print 'WARNING: UNKNOWN ITEM IN HDF5 FILE', g.name
        sys.exit ( "EXECUTION IS TERMINATED" )

    # Only files and groups have daughters to descend into
    if isinstance(g, h5py.File) or isinstance(g, h5py.Group) :
        for key,val in dict(g).iteritems() :
            subg = val
            print offset, key, #,"   ", subg.name #, val, subg.len(), type(subg),
            print_hdf5_item_structure(subg, offset + '    ')

if __name__ == "__main__" :
    print_hdf5_file_structure('/reg/d/psdm/XPP/xppcom10/hdf5/xppcom10-r0546.h5')
    sys.exit ( "End of test" )
|
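As an alternative to hand-written recursion, h5py groups provide the `visititems` method, which walks the whole tree below a group and calls a function with each item's relative name and object. The following is a minimal sketch written for Python 3, not a replacement for the example above: the helper names `list_hdf5_structure` and `demo_structure` are hypothetical, and the demo builds a small in-memory file rather than opening a real data file.

```python
import h5py

def list_hdf5_structure(group):
    """Return one description line per item below `group` (hypothetical helper)."""
    lines = []

    def visit(name, obj):
        # name is the path relative to `group`; obj is a Group or a Dataset
        if isinstance(obj, h5py.Dataset):
            lines.append('(Dataset) %s shape=%s' % (name, obj.shape))
        elif isinstance(obj, h5py.Group):
            lines.append('(Group) %s' % name)
        return None  # returning None continues the traversal

    group.visititems(visit)
    return lines

def demo_structure():
    """Build a small in-memory file and list its structure."""
    with h5py.File('demo_tree.h5', 'w', driver='core', backing_store=False) as f:
        # Intermediate groups are created automatically
        f.create_dataset('Configure:0000/Run:0000/image', data=[[1, 2], [3, 4]])
        return list_hdf5_structure(f)
```

Collecting the lines in a list instead of printing them makes the result easy to filter, for example to keep only the datasets.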
Example 4: Time-based synchronization of two datasets
...