...

  • Datasets need not be aligned. That is, the 5th image in a detector dataset may come from a different event than the 5th record in a gas detector dataset. One can match up records from different datasets by using the time datasets.
  • One should use the _mask datasets to identify valid data. A _mask dataset record is 1 when the corresponding record of the data dataset is valid, and 0 when it is not. When the _mask record is 0, the data record will be all zeros and should not be processed. The mask is 0 when the xtc data is damaged. The type of damage can then be found in the _damage dataset. The main reason to record damaged data is to keep datasets as aligned as possible.
  • The hdf5 group hierarchy has the following levels: run, calib cycle, type, source. Regular event data is organized into datasets that live at the source level. Epics is special: it has its own place, and rather than the two groups type and source, there are three groups for epics: type, source and epics pv name. Epics aliases live alongside epics pv names in this group hierarchy. Finally, configuration data (which usually arrives once) is found in subgroups of the Configure groups at the top of the hdf5 hierarchy.
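
The first two points above can be sketched in a few lines of Python. This is only an illustration: it assumes each time record has already been read into a (seconds, nanoseconds) tuple and each _mask dataset into a list of 0/1 values. The function names are hypothetical and not part of any psana or hdf5 API.

```python
# Illustrative sketch only: time records are assumed to be
# (seconds, nanoseconds) tuples, masks are assumed to be lists of 0/1.

def match_by_time(times_a, times_b):
    """Return (index_a, index_b) pairs for records that share an event time."""
    index_b = {t: j for j, t in enumerate(times_b)}
    return [(i, index_b[t]) for i, t in enumerate(times_a) if t in index_b]


def valid_pairs(pairs, mask_a, mask_b):
    """Keep only the pairs where both records are marked valid (mask == 1)."""
    return [(i, j) for i, j in pairs if mask_a[i] == 1 and mask_b[j] == 1]


# Two datasets that are not aligned: record 1 of A matches record 0 of B.
times_a = [(10, 0), (10, 5), (11, 0)]
times_b = [(10, 5), (11, 0), (12, 0)]
pairs = match_by_time(times_a, times_b)
good = valid_pairs(pairs, mask_a=[1, 1, 0], mask_b=[1, 1, 1])
```

Here the last record of A is damaged (mask 0), so only the first matched pair would be processed.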

psana-translate

The rest of this document covers psana-translate. When used for automatic translation, psana-translate is backward compatible with o2o-translate, save for a few minor differences discussed below. psana-translate runs as a psana module. As such, we have been able to develop several new features that will be discussed below. However, the main technical reason for phasing out o2o-translate is to use a Data Description Language (DDL) to generate code that handles the many data types that different detectors produce. This use of DDL is part of psana-translate.

...

psana -m Translator.H5Output -o Translator.H5Output.output_file=/reg/d/psdm/instrument/experiment/hdf5/exp-run001.h5 /reg/d/psdm/instrument/experiment/xtc/exp-run001*.xtc

...

  • filter out whole events from translation
  • filter out certain data, by data type, or by data source, or key string
  • write ndarray's that other modules add to the event store
  • write std::string's that other C++ modules add to the event store
  • advanced: have a C++ module register a new type for translation

Some aspects of these new features are subject to change. This will be discussed below. Any future changes in this regard will not affect automatic translation.

Important Changes between o2o-translate and psana-translate

The translation that psana-translate produces is almost always backward compatible with what o2o-translate produced. The only difference likely to affect users is where the CsPad calibration constants are found; this is discussed in the XTC-to-HDF5 Translator section below. There are also a number of minor differences which should make no difference to user code written to process o2o-translate hdf5 files. These are documented in the section Difference's with o2o-translate. hdf5 files created by o2o-translate or psana-translate contain an attribute defining the schema number. Below we document important changes introduced with Schema 4 as implemented in version V00-01-00 and above for psana-translate. o2o-translate implemented schema versions 1, 2 and 3. These important changes are the use of CalibStore for calibration constants, and dropping PNCCD::FullFrames from translation.

...

o2o-translate knows how to calibrate CsPad data. If o2o-translate is told where a calib-dir is (which it is for automatic translation) and calibration constants have been recorded in this directory (typically carried out by the calibration management tool by processing a dark run), then o2o-translate calibrates cspad data and writes the calibrated data instead of the raw xtc data. It writes the calibrated data in the same place where the raw xtc would have gone. It also writes the calibration constants used (such as pedestals and pixel status) in a special group. Finally, if the common mode calibration was done (this depends on what files are deployed to the calib-dir), the source group containing all the event data will include a common_mode dataset with the common mode values; the common mode is a correction calculated for each event. This allows users to recover the raw data from the calibrated data.

With psana-translate, calibration is handled by external psana modules. These modules produce calibrated data, and psana-translate finds it and translates it to the hdf5 file. Understanding this flow of data is not necessary for automatic translation; however, users who want to customize calibration need some understanding of how psana modules pass data through the event store and how they are configured through config files.  The calibrated data is distinguished from uncalibrated data by a key in the event store. The key defaults to the value 'calibrated', but this is configurable through the psana.cfg file, in the section for the calibration module used. psana-translate provides special treatment for the calibration key. For psana-translate, the default value for the calibration key is 'calibrated' as well, but again, this is configurable through the psana.cfg file, now in the section for Translator.H5Output. If psana-translate sees data with the calibration key, it defaults to translating only the calibrated data and not the raw data. In the hdf5 file, one will find calibrated data where one would have otherwise found uncalibrated data. This is consistent with how o2o-translate translated calibrated cspad data. The calibration key is not present in the hdf5 path names. This differs from keys attached to ndarrays, where the key is part of the h5 path name (see below).  The psana-translate option skip_calibrated can be set to true to get the uncalibrated data instead of calibrated data.
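
The choice between calibrated and uncalibrated data described above can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the Translator's implementation: event store entries are modelled as (type, src, key) tuples, uncalibrated data is assumed to carry an empty key, and the function name is hypothetical. Only the option names calibration_key and skip_calibrated come from the text.

```python
# Illustrative model: each event store entry is a (type_name, src, key) tuple.
# Raw (uncalibrated) data is assumed to have an empty key string.

def select_for_translation(entries, calibration_key="calibrated",
                           skip_calibrated=False):
    """Return the entries that would be written, preferring calibrated data."""
    # (type, src) pairs for which a calibrated version is present
    calibrated = {(t, s) for t, s, k in entries if k == calibration_key}
    selected = []
    for t, s, k in entries:
        if k == calibration_key:
            # calibrated data is written unless the user asked to skip it
            if not skip_calibrated:
                selected.append((t, s, k))
        elif k == "" and (t, s) in calibrated and not skip_calibrated:
            # raw data is replaced by its calibrated version
            continue
        else:
            selected.append((t, s, k))
    return selected


entries = [
    ("CsPad::DataV2", "CxiDs1.0:Cspad.0", ""),
    ("CsPad::DataV2", "CxiDs1.0:Cspad.0", "calibrated"),
    ("Bld::BldDataEBeamV3", "EBeam", ""),
]
default_choice = select_for_translation(entries)
raw_choice = select_for_translation(entries, skip_calibrated=True)
```

With the defaults, the calibrated cspad entry takes the place of the raw one; with skip_calibrated set, only the raw entries remain.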

...

An issue users may run into is understanding what calibration was done and recovering the raw data just from examining the hdf5 output. In the case of cspad, an understanding of the CsPadCalib module along with what is in the hdf5 file does allow one to recover the uncalibrated data. This may not be possible with other calibration modules and detectors, in particular if nonlinear calibration algorithms are applied, such as applying a threshold. It is also important to note that if CsPadCalib does not find any calibration constants, it will still put cspad data in the event store with the key 'calibrated'; however, that data will not be calibrated, as nothing will have been done to it. One would not expect to see any pedestals datasets in the CalibStore in this case.

PNCCD::FullFrame

This data is no longer translated. FullFrame is a copy of Frames with a more convenient interface. Users interested in having FullFrame written into their hdf5 files rather than the original Frames data should make a feature request.

...

Note the src level group names: noSrc__mesage and noSrc__measurements. Since no source was specified with the calls to evt.put, the Translator starts with the string noSrc in the group name. Two underscores, __, separate the source from the keystring.

Warning

This example illustrates the way our current hdf5 schema, schema 4, forms hdf5 paths that involve key strings for event data: source__key where the string noSrc can be used for source. This is the aspect of the new features that is subject to change.
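
As a small illustration of the source__key naming in schema 4, the helper below builds a src level group name from a source string and a key string. The function name is hypothetical, and the behaviour for an empty key (using the source alone) is an assumption; since this naming is subject to change, treat it as a sketch rather than a specification.

```python
# Hypothetical helper illustrating the schema 4 naming: source__key,
# with the string noSrc standing in when no source was given.

def src_group_name(source, key):
    """Build the src level hdf5 group name for event data."""
    src_part = source if source else "noSrc"
    # Assumption: data without a key string uses the source name alone.
    return f"{src_part}__{key}" if key else src_part


name_without_source = src_group_name("", "measurements")
name_with_source = src_group_name("CxiDs1.0:Cspad.0", "mykey")
```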

Filtering from Python Modules

A Python module can use standard psana features to skip events as discussed above. It can also add any Python object into the event store that has the key "do_not_translate". This will create the Filtered:0000/time dataset as above. However, to use the Translator filtering features that record user data, the Python module will have to add data that psana knows how to convert for C++ modules. Presently the only types that a Python module can add to the event store which will be seen by C++ modules are a number of ndarrays. A Python module will need to add one of these ndarray types to filter events; the data of the ndarray will be recorded in the hdf5 file.

...

  • /Configure:0000/Run:0000/CalibCycle:0000/MyData/example
    • Note how the C++ type name, MyData, shows up in the path. 
    • Next the 'src' level group is based on the key "example" passed when putting myData in the event store.
  • The dataset: /Configure:0000/Run:0000/CalibCycle:0000/MyData/example/data
    • The name "data" comes from the 2nd parameter to the HdfWriterNew object.
    • The dataset will be a 1D array of the hdf5 compound type with the fields
      • "eventCount"  uint32
      • "energy"  float
Warning

The interface to registering a new writer is subject to change.

 

Psana Configuration File and all Options


When running the translator as a psana module, it is often convenient to create a psana.cfg file.  The Translator package includes
the file default_psana.cfg, which is a psana configuration file that describes all the possible options, with extensive documentation
as to what they mean.  Below we include this file for reference. To use this file, one could copy it and modify it. However, it is not necessary to take the whole file, since every value in it is set to its default. One could simply use it as a reference for the options one wants to change.

######################################################################
[psana]

# MODULES: any modules that produce data to be translated need be loaded
# **BEFORE** Translator.H5Output (such as calibrated data or NDArray's)
# event data added by modules listed after Translator.H5Output is not translated.
modules = Translator.H5Output

files = **TODO: SPECIFY INPUT FILES OR DATA SOURCE HERE**

######################################################################
[Translator.H5Output]

# The only option you need to set for the Translator.H5Output module is
# output_file. All other options have default values (explained below).

# TODO: enter the full h5 output file name, including the output directory
output_file = output_directory/h5output.h5

# By default, the Translator will not overwrite the h5 file if it already exists
overwrite = false

# # # # # # # # # # # # # # # # # # # # #
# EPICS FILTERING
# The Translator can store epics pv's in one of two ways, or not at all.
# set store_epics below, to one of the following:
#
# updates_only  only stores an epics pv when it has changed. The pv is stored
#               in the current calib cycle.  For multi calib cycle experiments,
#               users may have to look back through several calib cycles to
#               find the latest value of a pv.
#
# calib_repeat  each calib cycle will include the latest value of all the epics
#               pv's. This can make it easier to find pv's for a calib cycle.
#               For experiments with many short calib cycles, it produces
#               many more datasets than necessary.
#
# no            epics pv's will not be stored. You may also want to set Epics=exclude
#               (see below) if you do not want the epics configuration data stored

# The default is 'calib_repeat'

store_epics = calib_repeat

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# FILTERING
#
# By default, all xtc data is Translated and many ndarrays that user modules (if any)
# add are translated. Filtering can occur in either the code of user modules, or
# through options in the psana.cfg file. Here in the config file, different groups of
# data can be filtered. There are four options for filtering data:
#
#   type filtering   -  for example, exclude all cspad, regardless of the detector source
#   source filtering -  for example, exclude any data from a given detector source
#   key filtering    -  for example, include only ndarrays with a given key string
#   calibration      -  do not translate original xtc if a calibrated version is found
#
# Type filtering is based on sets of Psana data types. If you know what detectors or
# devices to filter, leave type filtering alone and go to src_filter.
#
# Type filtering has the highest precedence, then key filtering, then source
# filtering, and lastly calibration filtering. When the Translator sees new data,
# it first checks the type filter. If it is a filtered type (or unknown type) no further
# translation occurs with the data - regardless of src or key. For data that gets
# past the type filter, the Translator looks at the src and key. If the key
# string is empty, it checks the source filter. Data with non empty key strings are
# handled via the key filter. If the src is filtered, but the key is not, then the
# data will be translated. Data with the special calibration key string are handled
# via the calibration filtering.
#
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# TYPE FILTERING
#
# One can include or exclude a class of Psana types with the following
# options. Only the strings include or exclude are valid for these
# type filtering options.
#
# Note - Epics in the list below refers only to the epicsConfig data
# which is the epics alias list, not the epics pv's. To filter the epics pv's
# see the 'store_epics' option above.

AcqTdc = include               # Psana::Acqiris::TdcConfigV1, Psana::Acqiris::TdcDataV1
AcqWaveform = include          # Psana::Acqiris::ConfigV1, Psana::Acqiris::DataDescV1
Alias = include                # Psana::Alias::ConfigV1
Andor = include                # Psana::Andor::ConfigV1, Psana::Andor::FrameV1
Arraychar = include            # Psana::Arraychar::DataV1
Control = include              # Psana::ControlData::ConfigV1, Psana::ControlData::ConfigV2, Psana::ControlData::ConfigV3
Cspad = include                # Psana::CsPad::ConfigV1, Psana::CsPad::ConfigV2, Psana::CsPad::ConfigV3, Psana::CsPad::ConfigV4, Psana::CsPad::ConfigV5, Psana::CsPad::DataV1, Psana::CsPad::DataV2
Cspad2x2 = include             # Psana::CsPad2x2::ConfigV1, Psana::CsPad2x2::ConfigV2, Psana::CsPad2x2::ElementV1
DiodeFex = include             # Psana::Lusi::DiodeFexConfigV1, Psana::Lusi::DiodeFexConfigV2, Psana::Lusi::DiodeFexV1
EBeam = include                # Psana::Bld::BldDataEBeamV0, Psana::Bld::BldDataEBeamV1, Psana::Bld::BldDataEBeamV2, Psana::Bld::BldDataEBeamV3, Psana::Bld::BldDataEBeamV4, Psana::Bld::BldDataEBeamV5
Encoder = include              # Psana::Encoder::ConfigV1, Psana::Encoder::ConfigV2, Psana::Encoder::DataV1, Psana::Encoder::DataV2
Epics = include                # Psana::Epics::ConfigV1
Epix = include                 # Psana::Epix::ConfigV1, Psana::Epix::ElementV1
EpixSampler = include          # Psana::EpixSampler::ConfigV1, Psana::EpixSampler::ElementV1
Evr = include                  # Psana::EvrData::ConfigV1, Psana::EvrData::ConfigV2, Psana::EvrData::ConfigV3, Psana::EvrData::ConfigV4, Psana::EvrData::ConfigV5, Psana::EvrData::ConfigV6, Psana::EvrData::ConfigV7, Psana::EvrData::DataV3
EvrIO = include                # Psana::EvrData::IOConfigV1
Evs = include                  # Psana::EvrData::SrcConfigV1
FEEGasDetEnergy = include      # Psana::Bld::BldDataFEEGasDetEnergy
Fccd = include                 # Psana::FCCD::FccdConfigV1, Psana::FCCD::FccdConfigV2
Fli = include                  # Psana::Fli::ConfigV1, Psana::Fli::FrameV1
Frame = include                # Psana::Camera::FrameV1
FrameFccd = include            # Psana::Camera::FrameFccdConfigV1
FrameFex = include             # Psana::Camera::FrameFexConfigV1
GMD = include                  # Psana::Bld::BldDataGMDV0, Psana::Bld::BldDataGMDV1
Gsc16ai = include              # Psana::Gsc16ai::ConfigV1, Psana::Gsc16ai::DataV1
Imp = include                  # Psana::Imp::ConfigV1, Psana::Imp::ElementV1
Ipimb = include                # Psana::Ipimb::ConfigV1, Psana::Ipimb::ConfigV2, Psana::Ipimb::DataV1, Psana::Ipimb::DataV2
IpmFex = include               # Psana::Lusi::IpmFexConfigV1, Psana::Lusi::IpmFexConfigV2, Psana::Lusi::IpmFexV1
L3T = include                  # Psana::L3T::ConfigV1, Psana::L3T::DataV1
OceanOptics = include          # Psana::OceanOptics::ConfigV1, Psana::OceanOptics::ConfigV2, Psana::OceanOptics::DataV1, Psana::OceanOptics::DataV2
Opal1k = include               # Psana::Opal1k::ConfigV1
Orca = include                 # Psana::Orca::ConfigV1
Partition = include            # Psana::Partition::ConfigV1
PhaseCavity = include          # Psana::Bld::BldDataPhaseCavity
PimImage = include             # Psana::Lusi::PimImageConfigV1
Pimax = include                # Psana::Pimax::ConfigV1, Psana::Pimax::FrameV1
Princeton = include            # Psana::Princeton::ConfigV1, Psana::Princeton::ConfigV2, Psana::Princeton::ConfigV3, Psana::Princeton::ConfigV4, Psana::Princeton::ConfigV5, Psana::Princeton::FrameV1, Psana::Princeton::FrameV2
PrincetonInfo = include        # Psana::Princeton::InfoV1
Quartz = include               # Psana::Quartz::ConfigV1
Rayonix = include              # Psana::Rayonix::ConfigV1, Psana::Rayonix::ConfigV2
SharedAcqADC = include         # Psana::Bld::BldDataAcqADCV1
SharedIpimb = include          # Psana::Bld::BldDataIpimbV0, Psana::Bld::BldDataIpimbV1
SharedPim = include            # Psana::Bld::BldDataPimV1
Spectrometer = include         # Psana::Bld::BldDataSpectrometerV0
TM6740 = include               # Psana::Pulnix::TM6740ConfigV1, Psana::Pulnix::TM6740ConfigV2
Timepix = include              # Psana::Timepix::ConfigV1, Psana::Timepix::ConfigV2, Psana::Timepix::ConfigV3, Psana::Timepix::DataV1, Psana::Timepix::DataV2
TwoDGaussian = include         # Psana::Camera::TwoDGaussianV1
UsdUsb = include               # Psana::UsdUsb::ConfigV1, Psana::UsdUsb::DataV1
pnCCD = include                # Psana::PNCCD::ConfigV1, Psana::PNCCD::ConfigV2, Psana::PNCCD::FramesV1

# user types to translate from the event store
ndarray_types = include        # ndarray<int8_t,1>, ndarray<int8_t,2>, ndarray<int8_t,3>, ndarray<int8_t,4>, ndarray<int16_t,1>, ndarray<int16_t,2>, ndarray<int16_t,3>, ndarray<int16_t,4>, ndarray<int32_t,1>, ndarray<int32_t,2>, ndarray<int32_t,3>, ndarray<int32_t,4>, ndarray<int64_t,1>, ndarray<int64_t,2>, ndarray<int64_t,3>, ndarray<int64_t,4>, ndarray<uint8_t,1>, ndarray<uint8_t,2>, ndarray<uint8_t,3>, ndarray<uint8_t,4>, ndarray<uint16_t,1>, ndarray<uint16_t,2>, ndarray<uint16_t,3>, ndarray<uint16_t,4>, ndarray<uint32_t,1>, ndarray<uint32_t,2>, ndarray<uint32_t,3>, ndarray<uint32_t,4>, ndarray<uint64_t,1>, ndarray<uint64_t,2>, ndarray<uint64_t,3>, ndarray<uint64_t,4>, ndarray<float,1>, ndarray<float,2>, ndarray<float,3>, ndarray<float,4>, ndarray<double,1>, ndarray<double,2>, ndarray<double,3>, ndarray<double,4>, ndarray<const int8_t,1>, ndarray<const int8_t,2>, ndarray<const int8_t,3>, ndarray<const int8_t,4>, ndarray<const int16_t,1>, ndarray<const int16_t,2>, ndarray<const int16_t,3>, ndarray<const int16_t,4>, ndarray<const int32_t,1>, ndarray<const int32_t,2>, ndarray<const int32_t,3>, ndarray<const int32_t,4>, ndarray<const int64_t,1>, ndarray<const int64_t,2>, ndarray<const int64_t,3>, ndarray<const int64_t,4>, ndarray<const uint8_t,1>, ndarray<const uint8_t,2>, ndarray<const uint8_t,3>, ndarray<const uint8_t,4>, ndarray<const uint16_t,1>, ndarray<const uint16_t,2>, ndarray<const uint16_t,3>, ndarray<const uint16_t,4>, ndarray<const uint32_t,1>, ndarray<const uint32_t,2>, ndarray<const uint32_t,3>, ndarray<const uint32_t,4>, ndarray<const uint64_t,1>, ndarray<const uint64_t,2>, ndarray<const uint64_t,3>, ndarray<const uint64_t,4>, ndarray<const float,1>, ndarray<const float,2>, ndarray<const float,3>, ndarray<const float,4>, ndarray<const double,1>, ndarray<const double,2>, ndarray<const double,3>, ndarray<const double,4>
std_string = include           # std::string


# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# TYPE FILTER SHORTCUT
#
# In addition to filtering Psana types by the options above, one can use
# the type_filter option below. For example:
#
# type_filter = include cspad     # will only translate cspad types. Will not translate
#                                 # ndarrays or strings
# type_filter = exclude Andor evr # translate all except the Andor or Evr types
#
# If you do not want to translate what is in the xtc file, use the psana shortcut:
#
# type_filter = exclude psana     # This will only translate ndarray's and strings
#
# Likewise doing:
#
# type_filter = include psana     # will translate all xtc data, but skip any ndarray's or strings
#
# The default is to include all

type_filter = include all

# note - if type_filter is anything other than 'include all' it takes precedence
# over the classes of type filter options above, like Cspad=include.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# SOURCE FILTERING
#
# The default for the src_filter option is "include all"
# If you want to include a subset of the sources, do
#
# src_filter = include srcname1 srcname2
#
# or if you want to exclude a subset of sources, do
#
# src_filter = exclude srcname1 srcname2
#
# The syntax for specifying a srcname follows that of the Psana Source (discussed in
# the Psana Users Guide). The Psana Source recognizes DAQ alias names (if present
# in the xtc files), several styles for specifying a Pds Src, as well as detector matches
# where the detector number, or device number is not known.
#
# Specifically, format of the match string can be:
#
#       DetInfo(det.detId:dev.devId) - fully or partially specified DetInfo
#       det.detId:dev.devId - same as above
#       DetInfo(det-detId|dev.devId) - same as above
#       det-detId|dev.devId - same as above
#       BldInfo(type) - fully or partially specified BldInfo
#       type - same as above
#       ProcInfo(ipAddr) - fully or partially specified ProcInfo
#
# For example
#       DetInfo(AmoETOF.0.Acqiris.0)
#       DetInfo(AmoETOF.0.Acqiris)
#       DetInfo(AmoETOF:Acqiris)
#       AmoETOF:Acqiris
#       AmoETOF|Acqiris
#
# will all match the same data, AmoETOF.0.Acqiris.0. The later ones will match
# additional data (such as detector 1, 2, etc.) if it is present.
#
# A simple way to set up src filtering is to take a look at the sources in the
# xtc input using the psana EventKeys module.  For example
#
# psana -n 5 -m EventKeys exp=cxitut13:run=22
#
# Will print the EventKeys in the first 5 events.  If the output includes
#
#   EventKey(type=Psana::EvrData::DataV3, src=DetInfo(NoDetector.0:Evr.2))
#   EventKey(type=Psana::CsPad::DataV2, src=DetInfo(CxiDs1.0:Cspad.0))
#   EventKey(type=Psana::CsPad2x2::ElementV1, src=DetInfo(CxiSc2.0:Cspad2x2.1))
#   EventKey(type=Psana::Bld::BldDataEBeamV3, src=BldInfo(EBeam))
#   EventKey(type=Psana::Bld::BldDataFEEGasDetEnergy, src=BldInfo(FEEGasDetEnergy))
#   EventKey(type=Psana::Camera::FrameV1, src=BldInfo(CxiDg2_Pim))
#
# Then one can filter on these six srcname's:
#
# NoDetector.0:Evr.2 CxiDs1.0:Cspad.0 CxiSc2.0:Cspad2x2.1 EBeam FEEGasDetEnergy CxiDg2_Pim
#

src_filter = include all

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# CALIBRATION FILTERING
#
# Psana calibration modules can produce calibrated versions of different
# data types. Depending on the module used, you may get an NDArray, an
# image, or the same data type as was in the xtc but with calibrated data.
#
# If you are doing the latter, the module output will be data of the same type
# and src as the uncalibrated data, with an additional key, such as 'calibrated'.
#
# If these modules are configured to use a different key, set calibration_key
# below accordingly:

calibration_key = calibrated

# The Translator defaults to writing calibrated data in place of uncalibrated
# data. If you do not want the calibrated data, set skip_calibrated to true.

skip_calibrated = false

# Note: setting skip_calibrated to true will force exclude_calibstore
# (below) to be true as well.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# CALIBSTORE FILTERING
#
# Calibration modules may publish the data they used to produce the calibrated
# event objects. Examples of data would be pedestal values, pixel status (what
# pixels are hot) and common mode algorithm parameters. This data will be published
# in what is called the Psana calibStore. When the Translator sees calibrated
# event data, it will look for the corresponding calibStore data as well.
# If you do not want it to translate calibStore data, set the following to true.

exclude_calibstore = false

# otherwise, the Translator will create a group CalibStore that holds the
# calibstore data. Note, the Translator looks for all calibStore data associated
# with the calibration modules. If a calibration module was configured to not do
# certain calibrations (such as gain) but the module still put gain values
# in the config store (even though it did not use them) the Translator
# would still translate those gain values.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# KEY FILTERING
#
# Psana modules loaded before the translator may put a variety of objects in the event
# store. By default, the Translator will translate any new data that it knows about.
# In addition to the psana types, it knows about NDArrays, C++ strings, and has a C++ interface
# for registering new simple types. NDarray's up to 4 dimensions of 10 basic types
# (8, 16, 32 and 64 bit signed and unsigned int, float and double) as well as the const
# versions of these types are translated.
#
# Generally Psana modules will attach keys to these objects (the keys are simply strings).
# To filter the set of keys that are translated, modify the parameter below:

key_filter = include all

# The default is to not look at the key but rather translate all data that the translator
# knows about. An example of including only data with the key finalanswer would be
#
# key_filter = include finalanswer
#
# To exclude a few keys, one can do
#
# key_filter = exclude arrayA arrayB
#
# Note, key filtering does not affect translation of data without keys. For instance
# setting key_filter = include keyA does not turn off translation of data without keys.
# Of all the data with keys, only those where the key is keyA will be translated.
#
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# COMPRESSION
#
# The following options control compression for most all datasets.
# Shuffling improves compression for certain datasets. Valid values for
# deflate (gzip compression level) are 0-9. Setting deflate = -1 turns off
# compression.

shuffle = true
deflate = 1

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# DAQ ALIAS LINKS
#
# When DAQ aliases exist in the xtc files, the Translator will create links
# using the alias names to the src hdf5 group. For example if one has an
# alias such as:
# acq01 -> SxrEndstation.0:Acqiris.0
# and one has Acqiris::DataDescV1 coming from this source, the h5 file will
# contain the link:
#  Acqiris::DataDescV1/acq01 -> Acqiris::DataDescV1/SxrEndstation.0:Acqiris.0
# so that one does not have to use the full src to access the data.
#
# To turn this feature off, set create_alias_links to false
create_alias_links = true
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# TECHNICAL, ADVANCED CONFIGURATION
#
# ---------------------------------------
# CHUNKING
# The commented options below give the default chunking options.
# Objects per chunk are selected from the target chunk size (16 MB) and
# adjusted based on min/max objects per chunk, and the max bytes per chunk.
# It is important that the chunkCache (created on a per dataset basis) be
# large enough to hold at least one chunk, ideally all chunks we need to have
# open at one time when writing to the dataset (usually one, unless we repair
# split events):
 
# chunkSizeTargetInBytes = 1703936 (16MB)
# chunkSizeTargetObjects = 0 (0 means select objects per chunk from chunkSizeInBytes)
# maxChunkSizeInBytes = 10649600 (100MB)
# minObjectsPerChunk = 50
# maxObjectsPerChunk = 2048
# chunkCacheSizeTargetInChunks = 3
# maxChunkCacheSizeInBytes = 10649600 (100MB)

# ---------------------------------------
# REFINED DATASET CONTROL
#
# There are six classes of datasets for which individual options for shuffle,
# deflate, chunkSizeTargetInBytes and chunkSizeTargetObjects can be specified:
#
# regular (most everything, all psana types)
# epics (all the epics pv's)
# damage (accompanies all regular data from event store)
# ndarrays (new data from other modules)
# string's (new data from other modules)
# eventId (the time dataset that also accompanies all regular data, epics pvs, ndarrays and strings)
#
# The options for regular datasets have been discussed above. The other five datasets
# get their default values for shuffle, deflate, chunkSizeInBytes and chunkSizeInObjects
# from the regular dataset options except in the cases below:
 
# damageShuffle = false
# stringShuffle = false
# epicsPvShuffle = false
# stringDeflate = -1
# eventIdChunkSizeTargetInBytes = 16384
# epicsPvChunkSizeTargetInBytes = 16384
#
# The rest of the shuffle, deflate and chunk size options for the other five datasets are:
#
# eventIdShuffle = true
# eventIdDeflate = 1
# damageDeflate = 1
# epicsPvDeflate = 1
# ndarrayShuffle = true
# ndarrayDeflate = 1
# eventIdChunkSizeTargetObjects = 0
# damageChunkSizeTargetInBytes = 1703936
# damageChunkSizeTargetObjects = 0
# stringChunkSizeTargetInBytes = 1703936
# stringChunkSizeTargetObjects = 0
# ndarrayChunkSizeTargetInBytes = 1703936
# ndarrayChunkSizeTargetObjects = 0
# epicsPvChunkSizeTargetObjects = 0

# ---------------------------------------
# SPLIT EVENTS
# When the Translator encounters a split event, it checks a cache to see
# if it has already seen it. If it has, it fills in any blanks that it can.
# To prevent this cache from growing too large, set the maximum number of
# split events to look back through here (default is 3000):

max_saved_split_events = 3000

# ---------------------------------------
# HDF5 GROUP NAMES
# The typenames for beam line data default to being written as (for example)
# Bld::BldDataEBeamV0. Setting short_bld_name to true causes it to be
# written as BldDataEBeamV0. If set to true, names are written differently
# than with o2o-translate and the change may break code that reads h5 files
# (such as psana).
short_bld_name = false

# ---------------------------------------
# HDF5 FILE PROPERTIES
#
# split large files, presently we only support NoSplit. Future options may be: Family and SplitScan
# for future splitting, splitSize defaults to 10 GB
split = NoSplit
splitSize = 10737418240
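
As a rough illustration of the key_filter rules described in the configuration comments above, the following Python sketch models which keys would pass the filter. The function name is_translated and the way the option string is parsed are hypothetical and are not taken from the Translator source; only the rules themselves (include all, include a list of keys, exclude a list of keys, and data without keys always being translated) come from the comments.

```python
def is_translated(key, key_filter="include all"):
    """Return True if data carrying `key` would pass the filter.

    Hypothetical model of the key_filter rules described above;
    this is not the Translator's actual implementation.
    """
    # Data without a key is never affected by key filtering.
    if not key:
        return True
    mode, *names = key_filter.split()
    if mode == "include":
        # "include all" disables filtering entirely.
        return names == ["all"] or key in names
    if mode == "exclude":
        return key not in names
    raise ValueError("key_filter must start with 'include' or 'exclude'")


print(is_translated("finalanswer", "include finalanswer"))  # True
print(is_translated("arrayA", "exclude arrayA arrayB"))     # False
print(is_translated("", "include keyA"))                    # True
```

The last call shows the point made in the note on key filtering: data without a key is translated regardless of the filter.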

...

  • configStore - only undamaged data is stored in the configStore
  • EventStore - undamaged data, and EBeam data with user damage is stored in the event, all other damage is not stored

psana-translate always records event ids and damage for any xtc data that psana processes, but it only translates data that passes psana's damage policy. So by default, damaged config objects and damaged events (other than user-damaged EBeam data) are not translated. This deviates slightly from what o2o-translate would translate: o2o-translate would also store out-of-order damaged event data. There is a psana option that can be added to the [psana] section of the .cfg file to recover this behavior. Below we document some special options that control what damaged data psana stores:

...

Here we cover differences with o2o-translate, not discussed above, that we expect will be minor and not affect user code.

Features Dropped from o2o-translate

hdf file creation parameters
Only NoSplit is implemented - no family or split drivers.

In general a number of o2o-translate options are no longer supported.  In particular:
-G (long names like CalibCycle:0000 instead of CalibCycle) is always on.

Significant Translation Differences

PNCCD::FullFrame data is no longer translated. FullFrame is a copy of Frames with a more convenient interface. Users interested in having FullFrame written into their hdf5 files rather than the original Frames data should make a feature request.

Speed

psana-translate runs about 10% slower than o2o-translate does.

Performance testing was done during November/December of 2013. Both o2o-translate and psana-translate worked through a 92 GB xtc file using compression=1 on the rhat6 machine psdev105, reading and writing the data on /u1. Both used the non-parallel compression library. o2o-translate produced a 68 GB file in 65 minutes and psana-translate produced a 65 GB file in 70 minutes (speeds of about 22 MB/sec). Production runs will use the parallel compression library and are expected to run faster (about 50 MB/sec).
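
The quoted speeds can be checked with a little arithmetic. The sketch below is only a sanity check of the numbers above; it assumes throughput is measured against the 92 GB input size and that 1 GB is taken as 1024 MB, neither of which is stated in the test description.

```python
def throughput_mb_per_s(input_gb, minutes):
    """Average input throughput in MB/s, taking 1 GB as 1024 MB (an assumption)."""
    return input_gb * 1024 / (minutes * 60)


o2o = throughput_mb_per_s(92, 65)
psana = throughput_mb_per_s(92, 70)
slowdown_pct = (70 / 65 - 1) * 100  # extra wall-clock time for psana-translate

print(f"o2o-translate:   {o2o:.1f} MB/s")                  # 24.2 MB/s
print(f"psana-translate: {psana:.1f} MB/s")                # 22.4 MB/s
print(f"psana-translate took {slowdown_pct:.1f}% longer")  # 7.7% longer
```

Under these assumptions the measured run is a little under 8% slower, which is in line with the "about 10%" figure quoted above.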

...

Below is a list of technical differences between psana-translate and o2o-translate. These differences should not affect end users.

  • File attributes runNumber, runType and experiment are not stored; instead expNum, experiment, instrument and jobName are stored (from the psana Env object)
  • The attribute :schema:timestamp-format is always "full", there is no option for "short"
  • The output file must be explicitly specified in the psana cfg file. It is not inferred from the input.
  • The File attribute origin is now psana-translator as opposed to translator
  • The end sec and nanoseconds are not written into the Configure group at the end of the job as there is no EventId in the Event at the end.
  • Integer size changes - a number of fields have changed size; a few examples are below. In one quirky case, this caused the translation to differ. The reason was that the data was uninitialized, and the new 32 bit value was different from the old 16 bit value. Data produced from 2014 onward will not include uninitialized data in the translation, so users will not have to worry about this. Uninitialized data is very rare in pre-2014 data and, due to its location, is not likely to be used in analysis.
  • A few examples of field size changes:
    • EvrData::ConfigV7/seq_config - sync_source - enum was uint16, now uint32
    • EvrData::ConfigV7/seq_config - beam_source - enum was uint16, now uint32
    • Ipimb::DataV2 - source_id was uint16, now uint8
    • Ipimb::DataV2 - conn_id was uint16, now uint8
    • Ipimb::DataV2 - module was uint16, now uint8
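
To see why widening a field can change a translated value when the extra bytes are uninitialized, consider the following sketch. The byte layout and the values (3 and 0xBEEF) are purely hypothetical and chosen for illustration; the actual field and memory layout involved in the quirky case are not described here.

```python
import struct

# Hypothetical 4-byte slot: the low two bytes hold the real value (3),
# the high two bytes are uninitialized garbage (0xBEEF here).
raw = struct.pack("<HH", 3, 0xBEEF)

as_uint16 = struct.unpack_from("<H", raw)[0]  # 16-bit read sees only the real value
as_uint32 = struct.unpack_from("<I", raw)[0]  # 32-bit read also picks up the garbage

print(as_uint16)       # 3
print(hex(as_uint32))  # 0xbeef0003
```

When the upper bytes are properly zeroed, both reads agree, which is why only uninitialized data is affected.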

...

Only one epics pv is stored per name (of course, one epics pv may have any number of elements within it). This is fine as the epics pv name is supposed to uniquely identify the pv. However in xtc files, you can see several epics pv's with the same pvname, but different pvid's. This scenario should only arise when the same pv is coming from different sources, and replicates the same data. Psana only stores one epics pv per name (the last one it sees in a datagram). This is the one that psana-translate will pick up and store.
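
The "last one seen wins" behavior can be pictured as a dictionary keyed by pv name. This is a minimal sketch under that assumption, not psana's actual data structure; the function name, pv names, pvid's and values are all hypothetical.

```python
def latest_pv_per_name(datagram_pvs):
    """Keep one record per pv name; a later record replaces an earlier one."""
    stored = {}
    for pv_id, pv_name, value in datagram_pvs:
        stored[pv_name] = (pv_id, value)
    return stored


# Hypothetical pv records from one datagram: (pvid, pvname, value)
pvs = [
    (0, "BEAM:ENERGY", 8.3),
    (1, "GAS:PRESSURE", 1.2),
    (7, "BEAM:ENERGY", 8.3),  # same name, different pvid, same data
]
print(latest_pv_per_name(pvs))
# {'BEAM:ENERGY': (7, 8.3), 'GAS:PRESSURE': (1, 1.2)}
```

Because duplicated pv's are expected to carry the same data, keeping only the last record per name should not lose information.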

All Epics pv's are stored in the source folder EpicsArch.0:NoDevice.0. With o2o-translate, some could be split off into other folders (such as AmoVMI.0:Opal1000.0). As epics pv names uniquely identify the data, the source information should not be needed.

Typenames that started with Bld::Bld can be shortened to start with just Bld, but they default to stay as Bld::Bld (set short_bld_name = true in the psana.cfg to shorten these names, but this may break existing code that reads .h5 files).

Some DAQ config objects include space for a maximum number of entries. o2o-translate would only write entries for those used, not the maximum entries. psana-translate does not make this adjustment. For example:

  • The Acqiris::ConfigV1 vert dataset now always prints the max of 20 channels, even if the user will only be using 3.
    • Note, in this case the Acqiris data will still only include the 3 channels being used. o2o-translate was making an adjustment to the config data being written.

psana-translate will write an empty dataset named output_lookup_table for Opal1k::ConfigV1, even if output_lookup_table() is enabled. o2o-translate would not.

...