Table of Contents

...

Introduction

Presently there are two xtc to hdf5 translators: o2o-translate and psana-translate. o2o-translate is the original translator. It is presently being used in production, but it is being phased out of use and replaced by psana-translate. The automatic hdf5 translation that users can execute from the web portal uses o2o-translate. Documentation on o2o-translate, which discusses some history with regards to selecting hdf5 as a scientific data format for general use, can be found

...

  • Datasets need not be aligned. That is, the 5th image in a detector dataset may come from a different event than the 5th record in a gas detector dataset. One can match up records from different datasets by using the time datasets.
  • One should use the _mask datasets to identify valid data. A _mask dataset record is 1 when the corresponding record of the data dataset is valid, and 0 when it is not. When the _mask record is 0, the data record will be all zeros and should not be processed.
  • The hdf5 group hierarchy has the following levels: run, calib cycle, type, source. Regular event data is organized into datasets that live at the source level, epics has its own place, and configuration data (that usually arrives once) has its own place as well.
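The two points above on time matching and _mask filtering can be sketched with numpy. This is a minimal illustration rather than the translator's own tooling: the function names match_by_time and valid_records are hypothetical, and the sketch assumes that each time record carries integer fields named seconds and nanoseconds, which should be checked against the actual file.

```python
import numpy as np

def match_by_time(time_a, time_b):
    """Return index arrays (ia, ib) so that time_a[ia] and time_b[ib] are the same events.

    Assumes each time record has integer 'seconds' and 'nanoseconds' fields
    (an assumption of this sketch) and that event times are unique in each dataset.
    """
    ns_per_s = np.uint64(1_000_000_000)
    # Fold seconds and nanoseconds into one 64-bit key per record.
    key_a = time_a["seconds"].astype(np.uint64) * ns_per_s + time_a["nanoseconds"].astype(np.uint64)
    key_b = time_b["seconds"].astype(np.uint64) * ns_per_s + time_b["nanoseconds"].astype(np.uint64)
    # intersect1d reports where the common keys sit in each input array.
    _, ia, ib = np.intersect1d(key_a, key_b, return_indices=True)
    return ia, ib

def valid_records(data, mask):
    """Keep only the records whose _mask entry is 1."""
    return np.asarray(data)[np.asarray(mask) == 1]
```

With h5py, the arrays would typically be read from the data, _mask and time datasets of a source group before being passed to these helpers.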

...

psana-translate

...

psana-translate is part of the analysis release. It will replace o2o-translate for automatic translation as soon as possible. Presently it is BETA software and subject to change; in particular, the output schema may change to better accommodate new features. The rest of this document covers psana-translate. psana-translate is backward compatible with o2o-translate save for a few minor differences discussed below. psana-translate runs as a psana module. As such, we have been able to develop several new features that will be discussed below. However, the main technical reason for phasing out o2o-translate is to use a Data Description Language (DDL) to generate code that handles the many data types that different detectors produce. This use of DDL is part of psana-translate.

...

The easiest way to try different translator options is to write a psana.cfg file. Copy the file default_psana.cfg that is included below (it is also in the Translator package directory) and modify the option values you wish to change. The file default_psana.cfg includes extensive documentation on all the translator options.

...

Important Changes between o2o-translate and psana-translate

The translation that psana-translate produces is almost always backward compatible with what o2o-translate produced. The only difference likely to affect users is where the CsPad calibration constants are found; this is discussed in The XTC-to-HDF5 Translator section below. There are also a number of minor differences which should make no difference to user code written to process o2o-translate hdf5 files. These are documented in the section Differences with o2o-translate. Below we document the important changes introduced with Schema 4, as implemented in psana-translate version V00-01-00 and above; these are the changes more likely to affect user code.

Aliases

Aliases are shortcuts scientists can set up for source addresses when configuring data collection with the Data Acquisition system (DAQ). For instance, the alias "evr0" may have been set up for the source address DetInfo(NoDetector.0:Evr.0). DAQ tools and psana can recognize either the alias or the source address. A feature of psana-translate which o2o-translate does not have is that aliases are used to create soft links to the source address, so users of hdf5 can work with either the alias or the original source address. However, this means there will be additional entries in the hdf5 file, and users may have to modify their code to handle them.

For example, o2o-translate will put evr config and event data in the groups

Code Block
/Configure:0000/EvrData::ConfigV7/NoDetector.0:Evr.0 
...
/Configure:0000/Run:0000/CalibCycle:0000/EvrData::DataV3/NoDetector.0:Evr.0 

If the alias evr0 has been set up for this source, then psana-translate will produce

Code Block
/Configure:0000/EvrData::ConfigV7/NoDetector.0:Evr.0 
/Configure:0000/EvrData::ConfigV7/evr0   Soft Link to {NoDetector.0:Evr.0}
...
/Configure:0000/Run:0000/CalibCycle:0000/EvrData::DataV3/NoDetector.0:Evr.0
/Configure:0000/Run:0000/CalibCycle:0000/EvrData::DataV3/evr0     Soft Link to {NoDetector.0:Evr.0}

If this poses a problem for updating code written for o2o-translate hdf5 files, note that the hdf5 library provides a way to identify soft links. All high level interfaces (such as h5py, pytables, Matlab, Octave or IDL) should provide this as well.
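As an illustration of identifying soft links from a high level interface, here is a minimal h5py sketch. The helper name list_children is hypothetical; the sketch relies on h5py's Group.get with getlink=True, which returns a link object (for example h5py.SoftLink) instead of the object the link points to.

```python
import h5py

def list_children(group, include_soft_links=True):
    """Return the child names of an h5py group, optionally skipping soft links."""
    names = []
    for name in group:
        # getlink=True returns the link object rather than the target it resolves to.
        link = group.get(name, getlink=True)
        if isinstance(link, h5py.SoftLink) and not include_soft_links:
            continue  # skip alias entries such as evr0
        names.append(name)
    return names
```

Called with include_soft_links=False on a type-level group such as /Configure:0000/EvrData::ConfigV7, this would be expected to return only the original source address entries and leave out the alias links.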

One can also exclude these soft links by adding

create_alias_links = false

to the [Translator.H5Output] section of the psana config file.

o2o-translate implemented schema versions 1, 2 and 3. The important changes introduced with Schema 4 are the use of CalibStore for calibration constants, and dropping PNCCD::FullFrames from translation.

Calibrated Data

o2o-translate knows how to calibrate CsPad data. If o2o-translate was told where a calib-dir was, and calibration constants were deployed (that is, written into this calib-dir), then o2o-translate would calibrate cspad data and write the calibrated data instead of the raw xtc data, in the same place where the raw xtc data would have gone. It would also write the calibration data used in a special group, and include the common mode values (if calculated; this depends on what files are deployed to the calib-dir) with the calibrated cspad data. This allows users to recover the raw data from the calibrated data.

psana-translate does not know how to calibrate CsPad or any other data. With psana-translate, calibration is handled by external psana modules. If one loads the psana calibration modules before the Translator.H5Output module, these modules will produce calibrated data, and psana-translate will find it and translate it to the hdf5 file. The calibrated data is distinguished from uncalibrated data by a key in the event store (the key defaults to 'calibrated', but this is configurable through the psana.cfg file, in the section for the calibration module). psana-translate knows about the key 'calibrated' (again configurable through the psana.cfg file, in the section for Translator.H5Output). If psana-translate sees data with the key 'calibrated', it defaults to translating only the data with the calibrated key and not the raw data (options allow both to be translated, but this will use much more disk space). psana-translate will also write the calibration data it finds into a special group. This group is called /Configure:0000/CalibStore. In the hdf5 file, one will find calibrated data where one would have otherwise found uncalibrated data. This is consistent with how o2o-translate translated calibrated cspad data. The calibrated key is not present in the hdf5 path names. This is different from what one finds for keys with ndarrays. For ndarrays the key is part of the h5 path name (see below). The psana-translate option skip_calibrated can be set to true to get the uncalibrated data instead of calibrated data.

Calibration makes use of calibration constants, such as pedestals and pixel status. A key difference between psana-translate and o2o-translate is where these calibration constants are found, and the datatypes used to store them. For psana-translate they are found in the group CalibStore under the current configure group. A key difference, however, is that psana-translate does not put calibrated data in the same group where uncalibrated data would go; it creates a new source level group name using its rules for combining a source and a key (concatenating them with two underscores in between). For example, suppose we translate the first event in a run of the cxi tutorial data, adding the cspad calibration module before the psana-translate module:

psana -n 1 -m cspad_mod.CsPadCalib,Translator.H5Output -o Translator.H5Output.output_file=calib.h5 exp=xpptut13:run=71

If we then examine the output, we will see

Code Block
h5ls -r calib.h5 | grep -i "calibstore\|cspad"  # this command will include the following output

/Configure:0000/Run:0000/CalibCycle:0000/CsPad2x2::ElementV1/XppGon.0:Cspad2x2.0__calibrated/common_mode Dataset {1/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/CsPad2x2::ElementV1/XppGon.0:Cspad2x2.0__calibrated/data Dataset {1/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/CsPad2x2::ElementV1/XppGon.0:Cspad2x2.0__calibrated/element Dataset {1/Inf}
...
/Configure:0000/Run:0000/CalibCycle:0000/CsPad2x2::ElementV1/XppGon.0:Cspad2x2.1__calibrated/common_mode Dataset {1/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/CsPad2x2::ElementV1/XppGon.0:Cspad2x2.1__calibrated/data Dataset {1/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/CsPad2x2::ElementV1/XppGon.0:Cspad2x2.1__calibrated/element Dataset {1/Inf}
...
/Configure:0000/CalibStore/pdscalibdata::CsPad2x2PedestalsV1/XppGon.0:Cspad2x2.0/pedestals Dataset {185, 388, 2}
/Configure:0000/CalibStore/pdscalibdata::CsPad2x2PedestalsV1/XppGon.0:Cspad2x2.1/pedestals Dataset {185, 388, 2}
/Configure:0000/CalibStore/pdscalibdata::CsPad2x2PixelStatusV1/XppGon.0:Cspad2x2.0/status Dataset {185, 388, 2}
/Configure:0000/CalibStore/pdscalibdata::CsPadCommonModeSubV1/XppGon.0:Cspad2x2.0/data Dataset {SCALAR}
...

Things to note:

...

  • There are common_mode datasets included with the data
  • Both cspad sources have a pedestal dataset in CalibStore
  • Only XppGon.0:Cspad2x2.0 has a common mode dataset in CalibStore.
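The naming rule seen in the listing above (source, two underscores, key) can be undone with a small helper when user code needs the plain source name. This is only a sketch: split_source_key is a hypothetical name, and it assumes the source address itself never contains two consecutive underscores.

```python
def split_source_key(group_name, separator="__"):
    """Split a source level group name into (source, key).

    Returns key=None when the name carries no key suffix.
    Assumes the source address itself does not contain the separator.
    """
    source, sep, key = group_name.partition(separator)
    return source, (key if sep else None)

# Example with a group name taken from the h5ls listing above.
source, key = split_source_key("XppGon.0:Cspad2x2.0__calibrated")
```

Here source is the string used in the CalibStore paths (XppGon.0:Cspad2x2.0), so it can be used to look up the matching pedestals or status dataset.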

...

An issue users may run into is understanding what calibration was done, and recovering the raw data, just from examining the hdf5 output. In the case of cspad, an understanding of the CsPadCalib module along with what is in the hdf5 file does allow one to recover the uncalibrated data. This may not be possible with future calibration modules and detectors, in particular if nonlinear calibration algorithms are applied, such as applying a threshold. It is also important to note that if CsPadCalib does not find any calibration constants, it will still put cspad data in the event store with the key 'calibrated'; however, the data will not be calibrated, since nothing will have been done to it. In this case psana-translate will still create group names with __calibrated appended, but they will hold the uncalibrated data. One would not expect to see any pedestals datasets in the CalibStore in this case.

...