Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Introduction

Presently there are two xtc to hdf5 translators, o2o-translate and psanaTranslation from the LCLS data format of XTC to the more general scientific format  HDF5 is carried out by the XTC to HDF5 Translator. We refer to the software that carries out the translation as psana-translate. This distinguishes it from the first version of the software called o2o-translate. o2o-translate is the original translator. It is being phased out of use and replaced by psana-translate. Translation is primarily carried out by automatic hdf5 translation that users can execute from the web portal. Documentation on o2o-translateno longer being maintained. While it is still possible to run o2o-translate, it will not understand the latest types of data used in LCLS experiments. Translation for experiments should generally be carried out by using the automatic hdf5 translation feature of the Web Portal of Experiments. Below features are discussed that allow for customization of the translation - filtering unwanted events or detectors as well as adding processed data. The web portal provides a mechanism to customize the translation using these features. Experiment POC's should be able to help with using these features. Documentation on o2o-translate, which discusses some history with regards to selecting hdf5 for a scientific data format for general use can be found

...

psana -m Translator.H5Output -o Translator.H5Output.output_file=exp-run001.h5 /reg/d/psdm/instrument/experiment/xtc/exp-run001*.xtcexp=xpptut13:run=71

would invoke the translator.  It will translate all the xtc files in run 00171 of the xpptut13 dataset. This runs with default values for all the translator options. These are the recommended option values to use for translation.  The options include gzip compression at level 1 and no filtering on events or data. The Translator does not overwrite an existing h5 output file by default (set the option overwrite=true to overwrite the output).

...

Starting in release ana-0.12.1, the Translator will support supports split scan mode. In this mode, each calib cycle will be written into a separate hdf5 file. A master file will have external links to the separate calib cycle files. Users need only work with the master file. The master file uses the same schema as one finds without split scan mode, with one exception discussed below (the logging of filtered events).. Otherwise, no modification to users code is required when working with the master file. Not all experiments use more than one calib cycle. For experiments that use one calib cycle per run, split scan mode provides no benefit. Two reasons to use split scan mode is first, that the resulting hdf5 file from normal translation is too large, and second, to parallelize the translation and make it faster.

...

HDF5 presently has little support for reading a file that is being created, and in general does not recommend this. However the master file is written in a way to support this as well as possible. Most all links from the master file to calib cycle files are not created until the calib cycle file is finished. The exception is the last N links, where N is the number of jobs running. These links may be written before the calib cycle files they link to are finished. To see updates in the master file, users may need to shut down programs like Matlab and h5py and restart them. It is not sufficient to close and reopen the master file within a Python or Matlab session.

 

 

New Features

With psana-translate, you can

Translation differences with split scan mode

The only difference users should see is if they provide modules that use the special key 'do_not_translate' to drop events from Translation. Ordinarily, as discussed below, in addition to dropping the event from translation, a event id for the dropped event is recorded in a hdf5 group such as /Configure:0000/Run:0000/Filtered:0000. These groups are not created in split scan mode (the event will still be be dropped).

New Features

With psana-translate, you can

  • filter out whole events from translation
  • filter
  • filter out whole events from translation
  • filter out certain data, by data type, or by data source, or key string
  • write ndarray's that other modules add to the event store
  • write std::string's that other C++ modules add to the event store
  • advanced: have a C++ module register a new type for translation

...

An issue users may run into is understanding what calibration was done and recovering the raw data just from examining the hdf5 output. In the case of cspad, an understanding of the CsPadCalib module along with the what is in the hdf5 file does allow one to recover the uncalibrated data. This may not be possible with other calibration modules and detectors, in particular if nonlinear calibration algorithms are applied, such as applying a thresholdof cspad, an understanding of the CsPadCalib module along with the what is in the hdf5 file does allow one to recover the uncalibrated data. This may not be possible with other calibration modules and detectors, in particular if nonlinear calibration algorithms are applied, such as applying a threshold.

Filtering Calibrated Data

By default, when the Translator sees both the original xtc data and a calibrated version of it, it writes the calibrated data in place of the xtc data. For automatic translation, we always load the module that will calibrate cspad if possible. If users do not want the calibrated data but prefer the raw data, they can set the option

skip_calibrated = true

If the user wants neither the calibrated nor the raw cspad, they they should use type filters, that is setting

Cspad = exclude
Cspad2x2 = exclude

There will be no cspad in the translation (including the cspad configuration data as well as event data). This can be useful if one is using other modules to produce ndarrays from the calibrated data and one only wants the final processed ndarrays in the translation.

PNCCD::FullFrame

This data is no longer translated. FullFrame is a copy of Frames with a more convenient interface for psana users, it is not considered to be as useful for hdf5 files. User's interested in having FullFrame written into their hdf5 files rather than the original Frames data should make a feature request.

...

One would use this to only translate user module data, such as ndarrays, strings and newly registered types.

The complete list of type aliases that users can use to filter is found in the default_psana.cfg file included below.

Src Filtering

Specific src's can be filtered by providing a list such as

...

Code Block
languagebash
titlepsana-translate default_psana.cfg file - all options
collapsetrue
######################################################################
[psana]

# MODULES: any modules that produce data to be translated need be loaded 
# **BEFORE** Translator.H5Output (such as calibrated data or NDArray's)
# event data added by modules listed after Translator.H5Output is not translated.
modules = Translator.H5Output

files = **TODO: SPECIFY INPUT FILES OR DATA SOURCE HERE**

######################################################################
[Translator.H5Output]

# The only option you need to set for the Translator.H5Output module is
# output_file. All other options have default values (explained below).

# TODO: enter the full h5 output file name, including the output directory
output_file = output_directory/h5output.h5

# By default, the Translator will not overwrite the h5 file if it already exists
overwrite = false

# # # # # # # # # # # # # # # # # # # # #
# EPICS FILTERING
# The Translator can store epics pv's in one of two ways, or not at all.
# set store_epics below, to one of the following:
#
# updates_only   stores an epic pv when it has changed. The pv is stored 
#                in the current calib cycle.  For mutli calib cycle experiments, 
#                users may have to look back through several calib cycle's to 
#                find the latest value of a pv.
#
# calib_repeat   each calib cycle will include the latest value of all the epics 
#                pv's.  This can make it easier to find pv's for a calib cycle. 
#                For experiments with many short calib cycles, it produces 
#                many more datasets than neccessary.
#
# no             epics pv's will not be stored. You may also want to set Epics=exclude
#                (see below) if you do not want the epics configuration data stored

# The default is 'calib_repeat'

store_epics = calib_repeat

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# FILTERING
# 
# By default, all xtc data is Translated and many ndarrays that user modules (if any) 
# add are translated. Filtering can occur in either the code of user modules, or
# through options in the psana.cfg file. Here in the config file, different groups of 
# data can be filtered. There are four options for filtering data: 
#
#    type filtering   -  for example, exclude all cspad, regardless of the detector source
#    source filtering -  for example, exclude any data from a given detector source
#    key filtering    -  for example, include only ndarrays with a given key string
#    calibration      -  do not translate original xtc if a calibrated version is found
#
# Type filtering is based on sets of Psana data types. If you know what detectors or 
# devices to filter, leave type filtering alone and go to src_filter. 
#
# Type filtering has the highest precedence, then key filtering, then source 
# filtering, and lastly calibration filtering. When the Translator sees new data, 
# it first checks the type filter. If it is a filtered type (or unknown type) no further 
# translation occurs with the data - regardless of src or key. For data that gets 
# past the type filter, the Translator looks at the src and key. If the key 
# string is empty, it checks the source filter. Data with non empty key strings are 
# handled via the key filter. If the src is filtered, but the key is not, then the
# data will be translated. Data with the special calibration key string are handled 
# via the calibration filtering. 
#
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# TYPE FILTERING 
#
# One can include or exclude a class of Psana types with the following 
# options. Only the strings include or exclude are valid for these 
# type filtering options. 
# 
# Note - Epics in the list below refers only to the epicsConfig data
# which is the epics alias list, not the epics pv's. To filter the epics pv's
# see the 'store_epics' option above.

AcqTdc = include               # Psana::Acqiris::TdcConfigV1, Psana::Acqiris::TdcDataV1
AcqWaveform = include          # Psana::Acqiris::ConfigV1, Psana::Acqiris::DataDescV1
Alias = include                # Psana::Alias::ConfigV1
Andor = include                # Psana::Andor::ConfigV1, Psana::Andor::FrameV1
Arraychar = include            # Psana::Arraychar::DataV1
Control = include              # Psana::ControlData::ConfigV1, Psana::ControlData::ConfigV2, Psana::ControlData::ConfigV3
Cspad = include                # Psana::CsPad::ConfigV1, Psana::CsPad::ConfigV2, Psana::CsPad::ConfigV3, Psana::CsPad::ConfigV4, Psana::CsPad::ConfigV5, Psana::CsPad::DataV1, Psana::CsPad::DataV2
Cspad2x2 = include             # Psana::CsPad2x2::ConfigV1, Psana::CsPad2x2::ConfigV2, Psana::CsPad2x2::ElementV1
DiodeFex = include             # Psana::Lusi::DiodeFexConfigV1, Psana::Lusi::DiodeFexConfigV2, Psana::Lusi::DiodeFexV1
EBeam = include                # Psana::Bld::BldDataEBeamV0, Psana::Bld::BldDataEBeamV1, Psana::Bld::BldDataEBeamV2, Psana::Bld::BldDataEBeamV3, Psana::Bld::BldDataEBeamV4, Psana::Bld::BldDataEBeamV5
Encoder = include              # Psana::Encoder::ConfigV1, Psana::Encoder::ConfigV2, Psana::Encoder::DataV1, Psana::Encoder::DataV2
Epics = include                # Psana::Epics::ConfigV1
Epix = include                 # Psana::Epix::ConfigV1, Psana::Epix::ElementV1
EpixSampler = include          # Psana::EpixSampler::ConfigV1, Psana::EpixSampler::ElementV1
Evr = include                  # Psana::EvrData::ConfigV1, Psana::EvrData::ConfigV2, Psana::EvrData::ConfigV3, Psana::EvrData::ConfigV4, Psana::EvrData::ConfigV5, Psana::EvrData::ConfigV6, Psana::EvrData::ConfigV7, Psana::EvrData::DataV3
EvrIO = include                # Psana::EvrData::IOConfigV1
Evs = include                  # Psana::EvrData::SrcConfigV1
FEEGasDetEnergy = include      # Psana::Bld::BldDataFEEGasDetEnergy
Fccd = include                 # Psana::FCCD::FccdConfigV1, Psana::FCCD::FccdConfigV2
Fli = include                  # Psana::Fli::ConfigV1, Psana::Fli::FrameV1
Frame = include                # Psana::Camera::FrameV1
FrameFccd = include            # Psana::Camera::FrameFccdConfigV1
FrameFex = include             # Psana::Camera::FrameFexConfigV1
GMD = include                  # Psana::Bld::BldDataGMDV0, Psana::Bld::BldDataGMDV1
Gsc16ai = include              # Psana::Gsc16ai::ConfigV1, Psana::Gsc16ai::DataV1
Imp = include                  # Psana::Imp::ConfigV1, Psana::Imp::ElementV1
Ipimb = include                # Psana::Ipimb::ConfigV1, Psana::Ipimb::ConfigV2, Psana::Ipimb::DataV1, Psana::Ipimb::DataV2
IpmFex = include               # Psana::Lusi::IpmFexConfigV1, Psana::Lusi::IpmFexConfigV2, Psana::Lusi::IpmFexV1
L3T = include                  # Psana::L3T::ConfigV1, Psana::L3T::DataV1
OceanOptics = include          # Psana::OceanOptics::ConfigV1, Psana::OceanOptics::ConfigV2, Psana::OceanOptics::DataV1, Psana::OceanOptics::DataV2
Opal1k = include               # Psana::Opal1k::ConfigV1
Orca = include                 # Psana::Orca::ConfigV1
Partition = include            # Psana::Partition::ConfigV1
PhaseCavity = include          # Psana::Bld::BldDataPhaseCavity
PimImage = include             # Psana::Lusi::PimImageConfigV1
Pimax = include                # Psana::Pimax::ConfigV1, Psana::Pimax::FrameV1
Princeton = include            # Psana::Princeton::ConfigV1, Psana::Princeton::ConfigV2, Psana::Princeton::ConfigV3, Psana::Princeton::ConfigV4, Psana::Princeton::ConfigV5, Psana::Princeton::FrameV1, Psana::Princeton::FrameV2
PrincetonInfo = include        # Psana::Princeton::InfoV1
Quartz = include               # Psana::Quartz::ConfigV1
Rayonix = include              # Psana::Rayonix::ConfigV1, Psana::Rayonix::ConfigV2
SharedAcqADC = include         # Psana::Bld::BldDataAcqADCV1
SharedIpimb = include          # Psana::Bld::BldDataIpimbV0, Psana::Bld::BldDataIpimbV1
SharedPim = include            # Psana::Bld::BldDataPimV1
Spectrometer = include         # Psana::Bld::BldDataSpectrometerV0
TM6740 = include               # Psana::Pulnix::TM6740ConfigV1, Psana::Pulnix::TM6740ConfigV2
Timepix = include              # Psana::Timepix::ConfigV1, Psana::Timepix::ConfigV2, Psana::Timepix::ConfigV3, Psana::Timepix::DataV1, Psana::Timepix::DataV2
TwoDGaussian = include         # Psana::Camera::TwoDGaussianV1
UsdUsb = include               # Psana::UsdUsb::ConfigV1, Psana::UsdUsb::DataV1
pnCCD = include                # Psana::PNCCD::ConfigV1, Psana::PNCCD::ConfigV2, Psana::PNCCD::FramesV1

# user types to translate from the event store
ndarray_types = include        # ndarray<int8_t,1>, ndarray<int8_t,2>, ndarray<int8_t,3>, ndarray<int8_t,4>, ndarray<int16_t,1>, ndarray<int16_t,2>, ndarray<int16_t,3>, ndarray<int16_t,4>, ndarray<int32_t,1>, ndarray<int32_t,2>, ndarray<int32_t,3>, ndarray<int32_t,4>, ndarray<int64_t,1>, ndarray<int64_t,2>, ndarray<int64_t,3>, ndarray<int64_t,4>, ndarray<uint8_t,1>, ndarray<uint8_t,2>, ndarray<uint8_t,3>, ndarray<uint8_t,4>, ndarray<uint16_t,1>, ndarray<uint16_t,2>, ndarray<uint16_t,3>, ndarray<uint16_t,4>, ndarray<uint32_t,1>, ndarray<uint32_t,2>, ndarray<uint32_t,3>, ndarray<uint32_t,4>, ndarray<uint64_t,1>, ndarray<uint64_t,2>, ndarray<uint64_t,3>, ndarray<uint64_t,4>, ndarray<float,1>, ndarray<float,2>, ndarray<float,3>, ndarray<float,4>, ndarray<double,1>, ndarray<double,2>, ndarray<double,3>, ndarray<double,4>, ndarray<const int8_t,1>, ndarray<const int8_t,2>, ndarray<const int8_t,3>, ndarray<const int8_t,4>, ndarray<const int16_t,1>, ndarray<const int16_t,2>, ndarray<const int16_t,3>, ndarray<const int16_t,4>, ndarray<const int32_t,1>, ndarray<const int32_t,2>, ndarray<const int32_t,3>, ndarray<const int32_t,4>, ndarray<const int64_t,1>, ndarray<const int64_t,2>, ndarray<const int64_t,3>, ndarray<const int64_t,4>, ndarray<const uint8_t,1>, ndarray<const uint8_t,2>, ndarray<const uint8_t,3>, ndarray<const uint8_t,4>, ndarray<const uint16_t,1>, ndarray<const uint16_t,2>, ndarray<const uint16_t,3>, ndarray<const uint16_t,4>, ndarray<const uint32_t,1>, ndarray<const uint32_t,2>, ndarray<const uint32_t,3>, ndarray<const uint32_t,4>, ndarray<const uint64_t,1>, ndarray<const uint64_t,2>, ndarray<const uint64_t,3>, ndarray<const uint64_t,4>, ndarray<const float,1>, ndarray<const float,2>, ndarray<const float,3>, ndarray<const float,4>, ndarray<const double,1>, ndarray<const double,2>, ndarray<const double,3>, ndarray<const double,4>
std_string = include           # std::string


# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# TYPE FILTER SHORTCUT
#
# In addition to filtering Psana types by the options above, one can use
# the type_filter option below. For example:
#
# type_filter include cspad       # will only translate cspad types. Will not translate
#                                 # ndarrays or strings
# type_filter exclude Andor evr   # translate all except the Andor or Evr types
# 
# If you do not want to translate what is in the xtc file, use the psana shortcut:
#
# type_filter exclude psana       # This will only translate ndarray's and strings 
#
# Likewise doing:
#
# type_filter include psana       # will translate all xtc data, but skip any ndarray's or strings
#
# The default is to include all

type_filter include all

# note - if type_filter is anything other than 'include all' it takes precedence
# over the classes of type filter options above, like Cspad=include.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# SOURCE FILTERING
#
# The default for the src_filter option is "include all"
# If you want to include a subset of the sources, do
#
# src_filter include srcname1 srcname2  
#
#  or if you want to exclude a subset of sources, do
#
# src_filter exclude srcname1 srcname2
#
# The syntax for specifying a srcname follows that of the Psana Source (discussed in 
# the Psana Users Guide). The Psana Source recognizes DAQ alias names (if present
# in the xtc files), several styles for specifying a Pds Src, as well as detector matches 
# where the detector number, or device number is not known.
# 
# Specifically, format of the match string can be:
#
#       DetInfo(det.detId:dev.devId) - fully or partially specified DetInfo
#       det.detId:dev.devId - same as above
#       DetInfo(det-detId|dev.devId) - same as above
#       det-detId|dev.devId - same as above
#       BldInfo(type) - fully or partially specified BldInfo
#       type - same as above
#       ProcInfo(ipAddr) - fully or partially specified ProcInfo
#
# For example
#        DetInfo(AmoETOF.0.Acqiris.0)  
#        DetInfo(AmoETOF.0.Acqiris)  
#        DetInfo(AmoETOF:Acqiris)
#        AmoETOF:Acqiris
#        AmoETOF|Acqiris
#
# will all match the same data, AmoETOF.0.Acqiris.0. The later ones will match
# additional data (such as detector 1, 2, etc.) if it is present.
#
# A simple way to set up src filtering is to take a look at the sources in the 
# xtc input using the psana EventKeys module.  For example
#
# psana -n 5 -m EventKeys exp=cxitut13:run=22 
#
# Will print the EventKeys in the first 5 events.  If the output includes
#
#   EventKey(type=Psana::EvrData::DataV3, src=DetInfo(NoDetector.0:Evr.2))
#   EventKey(type=Psana::CsPad::DataV2, src=DetInfo(CxiDs1.0:Cspad.0))
#   EventKey(type=Psana::CsPad2x2::ElementV1, src=DetInfo(CxiSc2.0:Cspad2x2.1))
#   EventKey(type=Psana::Bld::BldDataEBeamV3, src=BldInfo(EBeam))
#   EventKey(type=Psana::Bld::BldDataFEEGasDetEnergy, src=BldInfo(FEEGasDetEnergy))
#   EventKey(type=Psana::Camera::FrameV1, src=BldInfo(CxiDg2_Pim))
#
# Then one can filter on these six srcname's:
#
#  NoDetector.0:Evr.2  CxiDs1.0:Cspad.0  CxiSc2.0:Cspad2x2.1  EBeam  FEEGasDetEnergy  CxiDg2_Pim
#

src_filter = include all

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# CALIBRATION FILTERING
#
# Psana calibration modules can produce calibrated versions of different 
# data types. Depending on the module used, you may get an NDArray, an 
# image, or the same data type as was in the xtc but with calibrated data.
#
# If you are doing the latter, the module output will be data of the same type 
# and src as the uncalibrated data, with an additional key, such as 'calibrated'.
# If these modules are configured to use a different key, set calibration_key
# below accordingly:

calibration_key = calibrated

# The Translator defaults to writing calibrated data in place of uncalibrated
# data. If you do not want the calibrated data and would prefer to have the
# original uncalibrated data from the xtc, then set skip_calibrated to true.

skip_calibrated = false

# note, setting skip_calibrated to true will force sets exclude_calibstore 
# (below) to be true as well.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# CALIBSTORE FILTERING
#
# Calibration modules may publish the data they used to produce the calibrated
# event objects. Examples of data would be pedestal values, pixel status (what
# pixels are hot) and common mode algorithm parameters. This data will be published
# in what is called the Psana calibStore. When the Translator sees calibrated 
# event data, it will look for the corresponsinding calibStore data as well.
# If you do not want it to translate calibStore data, set the following to true.

exclude_calibstore = false

# otherwise, the Translator will create a group CalibStore that holds the
# calibstore data. Note, the Translator looks for all calibStore data associated 
# with the calibration modules. If a calibration module was configured to not do 
# certain calibrations (such as gain) but the module still put gain values
# in the config store (even though it did not use them) the Translator 
# would still translate those gain values.

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# KEY FILTERING
#
# Psana modules loaded before the translator may put a variety of objects in the event 
# store. Be default, the Translator will translate any new data that it knows about.
# In addition to the psana types, it knows about NDArrays, C++ strings, and has a C++ interface 
# for registering new simple types. NDarray's up to 4 dimensions of 10 basic types 
# (8, 16, 32 and 64 bit signed and unsigned int, float and double) as well as the const 
# versions of these types are translated.
#
# Generally Psana modules will attach keys to these objects (the keys are simply strings).
# To filter the set of keys that are translated, modify the parameter below:

key_filter = include all

# The default is to not look at the key but rather translate all data that the translator
# knows about. An example of including only data with the key finalanswer would be
#
# key_filter = include finalanswer
#
# To exclude a few keys, one can do
#
# key_filter = exclude arrayA arrayB
#
# Note, key filtering does not affect translation of data without keys. For instance
# setting key_filter = include keyA does not turn off translation of data without keys.
# Of all the data with keys, only those where the key is keyA will be translated.
#
# ---------------------------------------
# SPLIT INTO SEPARTE HDF5 FILES BASED ON CALIB CYCLES
#
# There are two reasons to split the Translator output, the resulting hdf5 file is to 
# large, and to parallelize the translation and make it faster. The default is to 
# not split:

split = NoSplit

# however the Translator also supports SplitScan mode:
#
# split=SplitScan
#
# In SplitScan mode, in addition to the output File, one file will be made for every
# calib cycle. The output file (the master file) will include external links to the other files. 
# Several translator jobs may run simultaneously to divide the work of creating the calib cycle files.
# At this time, each Translator job reads through all the input, so launching too many jobs will
# significantly increase the amount of input processing.
# Dividing the work of SplitScan mode is done with the parameters

# jobTotal = 1
# jobNumber = 0

# which default to 1 job that is numbered 0. However if jobTotal=3 and jobNumber=1, this 
# Translator will process calibCycle 1, 4, 7, etc. If jobTotal is 3, the user MUST
# make sure to launch 3 Translator jobs with jobNumber being 0,1 and 2 to get all the calib cycle
# files written. jobNumber=0 will write the master file with the external links to the calib
# cycle files. 
#
# For example, the following two command lines:
#
# psana -m Translator.H5Output -o Translator.H5Output.output_file=mydir/split.h5 -o Translator.H5Output.split=SplitScan -o Translator.H5Output.jobNumber=0 -o Translator.H5Output.jobTotal=2 exp=xpp123:run=10
# psana -m Translator.H5Output -o Translator.H5Output.output_file=mydir/split.h5 -o Translator.H5Output.split=SplitScan -o Translator.H5Output.jobNumber=1 -o Translator.H5Output.jobTotal=2 exp=xpp123:run=10
#
# will divide the work into two translator jobs. When they finish, the output will be
# 
# mydir/split.h5
# mydir/split_cc0000.h5
# mydir/split_cc0001.h5
# ...
#
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # 
# COMPRESSION 
#
# The following options control compression for most all datasets.
# Shuffling improves compression for certain datasets. Valid values for
# deflate (gzip compression level) are 0-9. Setting deflate = -1 turns off
# compression.

shuffle = true
deflate = 1

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# TECHNICAL, ADVANCED CONFIGURATION
# 
# ---------------------------------------
# CHUNKING
# The commented options below give the default chunking options.
# Objects per chunk are selected from the target chunk size (16 MB) and 
# adjusted based on min/max objects per chunk, and the max bytes per chunk.
# It is important that the chunkCache (created on a per dataset basis) be 
# large enough to hold at least one chunk, ideally all chunks we need to have
# open at one time when writing to the dataset (usually one, unless we repair
# split events):
 
# chunkSizeTargetInBytes = 1703936 (16MB)
# chunkSizeTargetObjects = 0 (0 means select objects per chunk from chunkSizeInBytes)
# maxChunkSizeInBytes = 10649600  (100MB)
# minObjectsPerChunk = 50              
# maxObjectsPerChunk = 2048
# chunkCacheSizeTargetInChunks = 3
# maxChunkCacheSizeInBytes = 10649600  (100MB)

# ---------------------------------------
# REFINED DATASET CONTROL
#
# There are six classes of datasets for which individual options for shuffle,
# deflate, chunkSizeTargetInBytes and chunkSizeTargetObjects can be specified:
#
# regular (most everything, all psana types)
# epics (all the epics pv's)
# damage (accompanies all regular data from event store)
# ndarrays (new data from other modules)
# string's (new data from other modules)
# eventId (the time dataset that also accompanies all regular data, epics pvs, ndarrays and strings)
#
# The options for regular datasets have been discussed above. The other five datasets 
# get their default values for shuffle, deflate, chunkSizeInBytes and chunkSizeInObjects
# from the regular dataset options except in the cases below:
 
# damageShuffle = false
# stringShuffle = false
# epicsPvShuffle = false
# stringDeflate = -1
# eventIdChunkSizeTargetInBytes = 16384
# epicsPvChunkSizeTargetInBytes = 16384

# The rest of the shuffle, deflate and chunk size options for the other five datasets are:
#
# eventIdShuffle = true
# eventIdDeflate = 1
# damageDeflate = 1
# epicsPvDeflate = 1
# ndarrayShuffle = true
# ndarrayDeflate = 1
# eventIdChunkSizeTargetObjects = 0
# damageChunkSizeTargetInBytes = 1703936
# damageChunkSizeTargetObjects = 0
# stringChunkSizeTargetInBytes = 1703936
# stringChunkSizeTargetObjects = 0
# ndarrayChunkSizeTargetInBytes = 1703936
# ndarrayChunkSizeTargetObjects = 0
# epicsPvChunkSizeTargetObjects = 0

# ---------------------------------------
# SPLIT EVENTS
# When the Translator encounters a split event, it checks a cache to see
# if it has already seen it.  If it has, it fills in any blanks that it can.
# To prevent this cache from growing to large, set the maximum number of
# split events to look back through here (default is 3000):

max_saved_split_events = 3000

...