Introduction

The Translator package implements Xtc to Hdf5 translation, which is performed by the psana module H5Output. The module translates psana events into an hdf5 file. We shall call this system psana-translate (it can be run via a command line wrapper of that name). psana-translate is meant to replace o2o-translate, the external tool that previously performed translation. o2o-translate is part of the package O2OTranslator, which has been responsible for translating LCLS xtc files to hdf5 to date. The main reason for developing the new translator is to take advantage of DDL code generation for translating the many xtc data types into hdf5 datasets (implemented between the psddldata and psddl_hdf2psana packages). Developing the new translator as a psana module has also made it easy to support features such as event selection and translation of user data. In addition, as a psana module, the translator now shares all of psana's code for parsing xtc files.
Documentation on O2OTranslator, which discusses some of the history behind selecting hdf5 as a general-purpose scientific data format, can be found here:

Translator
User Interface to Translator

This documentation also contains important links to the Interface Controller, which manages automatic translation.

A discussion of the input and output formats for the translator can be found here:

Event data format - the input format
Scientific data format - the output format

Below we discuss the new translator, psana-translate. The input and output formats have not changed between o2o-translate and psana-translate.

 

Psana-translate User's Guide

Running the Translator

 

There are three ways to run the translator

  • Through the Interface Controller (once it has been configured to run psana-translate rather than o2o-translate)
  • As a psana module, either through psana command line options or writing a psana configuration file.  The module is Translator.H5Output.
  • Through psana-translate, the command line wrapper for Translator.H5Output

The only option the Translator requires is the name of the output file.  This must be a fully qualified filename, including the output directory.  For example:

psana -m Translator.H5Output -o Translator.H5Output.output_file=/reg/d/psdm/instrument/experiment/hdf5/exp-run001.h5 /reg/d/psdm/instrument/experiment/xtc/exp-run001*.xtc

This invokes the translator, which will translate all the xtc files in run 001. It runs with default values for all the translator options. These are the recommended option values to use for translation.  The defaults include gzip compression at level 1 and no filtering of events or data.

If you are going to use many of the translator options, it will be easier to write the command line using psana-translate, a command line wrapper for the Translator.H5Output module. psana-translate is run as

   psana-translate [psana arguments] --output_file=h5outfile [optional H5Output options] [--save_cfg=filename]

For example

   psana-translate -v -v -n 5 /reg/d/psdm/mec/mec01/xtc/e01-r001-s0*-c0*.xtc --output_file=output.h5 --store_epics=no --Epics=exclude Control=exclude src_filter="exclude NoDetector.0.Evr.2"

This uses psana options to get debugging output and to read only the first 5 events from the mec01 run 1 files. It then uses translator options to exclude all epics data as well as the Control types (Psana types that start with ControlData::ConfigV), and to exclude data coming from the src NoDetector.0.Evr.2.  Note the use of double quotes to specify the multiword value for the src_filter option.
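
One way to picture how the wrapper relates to the plain psana command line shown earlier is as a rewrite of each `--name=value` option into psana's `-o Translator.H5Output.name=value` form. The sketch below is only an illustration of that idea, not the wrapper's actual implementation: the function name toPsanaArgs is hypothetical, it assumes psana's own arguments are given in short form (such as -v or -n 5), and it simply drops --save_cfg, which is a wrapper option rather than an H5Output option.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of the Translator package. Rewrites
// wrapper-style "--name=value" arguments into the "-o Module.name=value"
// form that psana accepts for module options. Everything else (short psana
// options, input files) is passed through unchanged.
std::vector<std::string> toPsanaArgs(const std::vector<std::string>& args) {
  const std::string module = "Translator.H5Output";
  std::vector<std::string> out;
  out.push_back("-m");
  out.push_back(module);
  for (const std::string& arg : args) {
    if (arg.rfind("--save_cfg=", 0) == 0) {
      continue;  // wrapper-only option; this sketch ignores it
    }
    const bool isLongOption = arg.size() > 2 && arg.compare(0, 2, "--") == 0;
    const bool hasValue = arg.find('=') != std::string::npos;
    if (isLongOption && hasValue) {
      out.push_back("-o");
      out.push_back(module + "." + arg.substr(2));  // strip the leading "--"
    } else {
      out.push_back(arg);
    }
  }
  return out;
}
```

Under these assumptions, the arguments `-n 5 in.xtc --output_file=out.h5` would become `-m Translator.H5Output -n 5 in.xtc -o Translator.H5Output.output_file=out.h5`.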

The easiest way to try different translator options is to write a psana.cfg file.  Copy the file default_psana.cfg that is included below (it is also in the Translator package directory) and modify the option values that you wish to change.  The file default_psana.cfg includes extensive documentation on all the translator options.
Run psana-translate for more help on using the wrapper. For details on all of the translation options, see below.

New Features

With psana-translate, you can

...

Since psana-translate runs as a psana module, it is possible to filter translated events through psana options and other modules. psana options allow you to start at a certain event and to process a certain number of events.  Moreover, a user module that is loaded before the Translator module can tell psana not to pass an event on to any other modules; H5Output will then never see the event, and it will not be translated.

psana-translate also provides a C++ interface to filtering that will record the event times of the filtered events, as well as an optional user log message (to record a note as to why the event was filtered).  A C++ module that is loaded before Translator.H5Output would do:

Translator do not translate example:

#include <string>
#include "Translator/doNotTranslate.h"

// define user Module, loaded before Translator.H5Output

virtual void event(Event &evt, Env &env) {
  // Flag the event so that it is not translated, and log the reason.
  Translator::doNotTranslateEvent(evt, std::string("the beam energy is too low"));
}

 

For more information see the function doNotTranslate and the example class TestDoNotTranslate.  Using this function will cause a group 'filtered' to be created in each CalibCycle where events are filtered.  The filtered group will include a 'time' dataset (with the event ids of the filtered events) and a 'data' dataset with the log messages.

Filtering Types

...

would cause any of the types Psana::Bld::BldDataEBeamV0, Psana::Bld::BldDataEBeamV1, Psana::Bld::BldDataEBeamV2, Psana::Bld::BldDataEBeamV3 or Psana::Bld::BldDataEBeamV4 to be excluded from translation.  See the section Psana Configuration File and all Options below for more details.

Src Filtering

...

src_filter = exclude NoDetector.0:Evr.2  CxiDs1.0:Cspad.0  CxiSc2.0:Cspad2x2.1  EBeam  FEEGasDetEnergy  CxiDg2_Pim

Again, see the section Psana Configuration File and all Options below for more details.
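
To make the shape of a src_filter value concrete, the following is a minimal sketch of how such a value could be split into a keyword and a list of sources, and then used to decide whether a source is translated. It is not the translator's code. The names SrcFilter and parseSrcFilter are hypothetical, the sketch assumes that a keyword other than exclude means "translate only the listed sources", and it compares source names by exact string match, which may be stricter than what the translator actually does.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Illustrative model of a src_filter value; all names here are hypothetical.
struct SrcFilter {
  bool exclude = false;           // true: the listed sources are dropped
  std::set<std::string> sources;  // source names listed after the keyword

  // Returns true if data from `src` would be translated under this filter.
  bool translates(const std::string& src) const {
    const bool listed = sources.count(src) > 0;
    return exclude ? !listed : listed;
  }
};

// Parses a value such as "exclude A B C". Sources are separated by any
// amount of whitespace.
SrcFilter parseSrcFilter(const std::string& value) {
  std::istringstream in(value);
  std::string keyword;
  in >> keyword;
  SrcFilter filter;
  filter.exclude = (keyword == "exclude");
  std::string src;
  while (in >> src) {
    filter.sources.insert(src);
  }
  return filter;
}
```

With the src_filter line above as input, translates("EBeam") is false in this model, while a source that is not listed, such as a second Cspad detector, would still be translated.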

Writing NDArrays and Strings

ndarrays (up to dimension 4, of the standard integral and float types) and std::strings that are written into the event store will be written to the hdf5 file by default.  This data can be filtered as well.  See the section Psana Configuration File and all Options below for more details.

Psana Configuration File and all Options


When running the translator as a psana module, it is often convenient to create a psana.cfg file.  The Translator package includes
the file default_psana.cfg, which is a psana configuration file that describes all the possible options, with extensive documentation
as to what they mean.  Below we include this file for reference:

...

  • configStore - only undamaged data is stored in the configStore
  • EventStore - undamaged data and EBeam data with user damage are stored in the event; all other damaged data is not stored

The translator always records event ids and damage for any xtc data that psana processes, but it only translates data that passes psana's damage policy. So by default, damaged config objects and damaged event data (other than user damaged EBeam data) are not translated. This deviates slightly from o2o-translate, which would also store out of order damaged event data.  There is a psana option that can be added to the [psana] section of the .cfg file to recover this behavior.  Below we document the special options that control what damaged data psana stores:

  • store-out-of-order-damage  - defaults to false, set to true if you want to translate out of order damaged data
  • store-user-ebeam-damage  - defaults to true, set to false if you do not want to translate EBeam data that only has user damage
  • store-damaged-config - defaults to false, set to true if you want to store damaged config data
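
The way these three options combine with the default policy can be sketched as a small decision function. This is a simplified model built only from the description above, not psana's implementation: the Damage categories, the DamageOptions struct and both function names are hypothetical, and real damage information is more detailed than the four cases used here.

```cpp
#include <cassert>

// Simplified, hypothetical categories of damage for a piece of event data.
enum class Damage { None, UserEBeamOnly, OutOfOrder, Other };

// Mirrors the three [psana] options and their documented defaults.
struct DamageOptions {
  bool storeOutOfOrderDamage = false;  // store-out-of-order-damage
  bool storeUserEbeamDamage = true;    // store-user-ebeam-damage
  bool storeDamagedConfig = false;     // store-damaged-config
};

// Decides whether event data is stored in the event (and hence translated).
bool storeEventData(Damage damage, const DamageOptions& opt) {
  switch (damage) {
    case Damage::None:          return true;
    case Damage::UserEBeamOnly: return opt.storeUserEbeamDamage;
    case Damage::OutOfOrder:    return opt.storeOutOfOrderDamage;
    case Damage::Other:         return false;
  }
  return false;
}

// Decides whether a config object is stored in the configStore.
bool storeConfigData(bool damaged, const DamageOptions& opt) {
  return !damaged || opt.storeDamagedConfig;
}
```

In this model, out of order damaged data is dropped with the default options and kept once storeOutOfOrderDamage is set to true, which corresponds to recovering the o2o-translate behavior.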

...

  • File attributes runNumber, runType and experiment are not stored; instead expNum, experiment, instrument and jobName are stored (from the psana Env object)
  • The attribute :schema:timestamp-format is always "full", there is no option for "short"
  • The output file must be explicitly specified in the psana cfg file. It is not inferred from the input.
  • The File attribute origin is now psana-translator as opposed to translator
  • The end seconds and nanoseconds are not written into the Configure group at the end of the job, as there is no EventId in the Event at the end.
  • Integer size changes - a number of fields have changed size; a few examples are below.  In one quirky case, this caused the translation to be different.  The reason was that the data was uninitialized, and the new 32 bit value was different from the old 16 bit value. Data produced from 2014 onward will not include uninitialized data in the translation, so users will not have to worry about this.  Uninitialized data is very rare in pre-2014 data and, due to its location, is not likely to be used in analysis.
  • A few examples of field size changes:
    • EvrData::ConfigV7/seq_config - sync_source - enum was uint16, now uint32
    • EvrData::ConfigV7/seq_config - beam_source - enum was uint16, now uint32
    • Ipimb::DataV2 - source_id was uint16, now uint8
    • Ipimb::DataV2 - conn_id was uint16, now uint8
    • Ipimb::DataV2 - module was uint16, now uint8
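
The quirky case mentioned above can be illustrated with a small sketch. It shows one plausible way in which widening a field from 16 to 32 bits changes the translated value when part of the underlying storage was never initialized. The byte layout (little-endian, with the two high-order bytes left as garbage) is an assumption made for the illustration, and the function names are hypothetical; the source does not describe the actual layout.

```cpp
#include <cassert>
#include <cstdint>

// Reads the first two bytes as a little-endian 16 bit value (old field size).
std::uint16_t readU16LE(const unsigned char* b) {
  return static_cast<std::uint16_t>(b[0] | (b[1] << 8));
}

// Reads four bytes as a little-endian 32 bit value (new field size).
std::uint32_t readU32LE(const unsigned char* b) {
  return static_cast<std::uint32_t>(b[0]) |
         (static_cast<std::uint32_t>(b[1]) << 8) |
         (static_cast<std::uint32_t>(b[2]) << 16) |
         (static_cast<std::uint32_t>(b[3]) << 24);
}
```

If the four bytes are 0x02 0x00 0xCD 0xAB, where the last two stand in for uninitialized memory, the 16 bit read gives 2 while the 32 bit read gives 0xABCD0002. When all four bytes are initialized (0x02 0x00 0x00 0x00), both reads give 2 and the translation is unchanged.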

...

Some DAQ config objects include space for a maximum number of entries.  o2o-translate would only write the entries that were used, not the maximum number of entries.  psana-translate makes no such adjustment and writes all of them.  For example:

  • The Acqiris::ConfigV1 vert dataset now always contains the maximum of 20 channels, even if the user is only using 3.

      ...

        • Note, in this case the Acqiris data will still only include the 3 channels being used. o2o-translate was making an adjustment to the config data being written.

      psana-translate will write an empty output_lookup_table for Opal1k::ConfigV1, even if output_lookup_table() is enabled.  o2o-translate would not.

      ...

      As discussed above, OutOfOrder damage is not translated by default. o2o-translate translated out of order damage, whereas psana-translate does not.  psana can be told to include this kind of damaged data by setting store-out-of-order-damage=true in the [psana] section of your .cfg file.

      ...