Introduction

This document describes the C++ analysis framework for LCLS (psana) and how users can make use of its features. The psana design borrows ideas from a multitude of other frameworks such as pyana, myana, the BaBar framework, etc. Its main principles are summarized here:

...

The central part of the framework is a regular pre-built application (psana) which can dynamically load one or more user analysis modules which are written in C++ or Python. The core application is responsible for the following tasks:

  • loading and initializing all user modules
  • loading one of the input modules to read data from XTC or HDF5
  • calling appropriate methods of user modules based on the data being processed
  • providing access to data as a set of C++ classes and a set of Python classes
  • providing other services such as histogramming to user modules

...

  • user module – instance of a C++ or Python class which inherits from the pre-defined Module class and defines a few special methods which are called by the framework
  • event – special object which transparently stores all event data
  • environment – special object which stores non-event data such as configuration objects or EPICS data

...

Typically psana will iterate through all transitions/events from the input files. User modules have limited control over this event loop: a module can request to skip a particular event, stop iteration early, or abort the job using one of the methods described below.
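
The control flow described above can be illustrated with a small self-contained sketch. It does not use the psana classes: SketchModule, Action and runLoop are hypothetical names, and the sketch reports the request through a return value, whereas psana modules use dedicated methods. It also assumes that a skipped event is simply hidden from the modules that come later in the chain.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the psana classes.
enum class Action { Continue, Skip, Stop, Abort };

struct SketchModule {
  virtual ~SketchModule() {}
  // The return value tells the loop what to do with the current event.
  virtual Action event(int eventNumber) = 0;
};

// Skips every odd-numbered event and stops the loop at event 6.
struct OddSkipper : SketchModule {
  Action event(int n) override {
    if (n == 6) return Action::Stop;
    return (n % 2 != 0) ? Action::Skip : Action::Continue;
  }
};

// Records which events reach it.
struct Recorder : SketchModule {
  std::vector<int> seen;
  Action event(int n) override { seen.push_back(n); return Action::Continue; }
};

// Runs events 0..nEvents-1 through the module chain.
// Returns false if a module aborted the job, true otherwise.
bool runLoop(const std::vector<SketchModule*>& modules, int nEvents) {
  for (int n = 0; n < nEvents; ++n) {
    for (SketchModule* m : modules) {
      Action a = m->event(n);
      if (a == Action::Skip) break;          // later modules do not see this event
      if (a == Action::Stop) return true;    // end iteration early, job ends normally
      if (a == Action::Abort) return false;  // job is aborted
    }
  }
  return true;
}
```

With a chain of OddSkipper followed by Recorder and ten input events, the Recorder in this sketch only sees events 0, 2 and 4.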

User Modules

A user module provides an instance of a class that inherits from the Psana Module class. Below we discuss this for C++. The Psana Module class is defined in the file psana/Module.h and implements several methods. These methods are already mentioned above; here is a more formal description of each method:

...

In addition to the event() method, every module class must provide a constructor which takes a string argument giving the name of the module. It also has to provide a special factory function used to instantiate the module from the shared library; a special macro is provided for defining this factory function.
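
To give an idea of what a factory macro of this kind might do, here is a self-contained sketch. It is not the definition of the psana macro: SketchModule, SKETCH_MODULE_FACTORY and the generated function name sketch_factory_<Class> are all hypothetical. The idea shown is that the macro expands to an extern "C" function with a predictable name, which a loader could then look up in a shared library.

```cpp
#include <string>

// Hypothetical stand-in for the framework's base class.
class SketchModule {
public:
  explicit SketchModule(const std::string& name) : m_name(name) {}
  virtual ~SketchModule() {}
  const std::string& name() const { return m_name; }
private:
  std::string m_name;
};

// Hypothetical macro: defines an extern "C" function with a predictable,
// unmangled name that returns a new instance of the given module class.
#define SKETCH_MODULE_FACTORY(UserModule)                                  \
  extern "C" SketchModule* sketch_factory_##UserModule(const char* name) { \
    return new UserModule(name);                                           \
  }

// A user class with the required constructor taking the module name.
class MyModule : public SketchModule {
public:
  explicit MyModule(const std::string& name) : SketchModule(name) {}
};

// Expands to the definition of sketch_factory_MyModule().
SKETCH_MODULE_FACTORY(MyModule)
```

Calling sketch_factory_MyModule("MyPackage.MyModule") returns a heap-allocated MyModule whose name() is the string passed in; the caller owns the object.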

Here is a minimal example of the module class declaration with only the event() method implemented; many non-essential details are skipped:

Code Block
// ==== Package/ExampleModule.h ====
#include "psana/Module.h"

namespace Package {
class ExampleModule: public Module {
public:

  // Constructor takes module name as a parameter
  ExampleModule(const std::string& name);

  // Implementation of event() from base class
  virtual void event(Event& evt, Env& env);

};
} // namespace Package

Definition of the factory function and methods:

Code Block
// ==== Package/ExampleModule.cpp ====
#include "Package/ExampleModule.h"
#include "MsgLogger/MsgLogger.h"
#include "PSEvt/EventId.h"

// define factory function
using namespace Package;
PSANA_MODULE_FACTORY(ExampleModule)

// Constructor
ExampleModule::ExampleModule(const std::string& name)
  : Module(name)
{
}

void
ExampleModule::event(Event& evt, Env& env)
{
  // get event ID
  shared_ptr<EventId> eventId = evt.get();
  if (not eventId.get()) {
    MsgLog(name(), info, "event ID not found");
  } else {
    MsgLog(name(), info, "event ID: " << *eventId);
  }
}

...

Skipped events can be used in further analysis or saved in a "filtered" XTC file, as explained in Package PSXtcOutput.

Job and Module Configuration

...

The parameters that are needed for the framework are defined in the [psana] section. Here is the list of parameters which can appear in that section:

  • modules
    list of module names to include in the analysis job. Each module name is built of a package name and class name separated by a dot (e.g. TestPackage.ExampleModule), optionally followed by a colon and a modifier. The modifier is not needed if there is only one instance of the module in the job. If there is more than one instance, the module names need to include unique modifiers to distinguish the instances. If the module comes from the psana package, the package name can be omitted. Module names can also be specified on the command line with the -m option; for multiple modules use multiple -m options or comma-separated names in a single -m option.
  • input or files
    specifies input data, a list of datasets or file names to process. Input data can also be specified on the command line, which will override anything specified in the configuration file. See section Specifying input data for more details on dataset syntax.
  • events
    maximum number of events to process in a job, can also be given on the command line with the -n or --num-events option.
  • skip-events
    number of events to skip before starting event processing, can also be given on the command line with the -s or --skip-events option.
  • instrument
    Instrument name.
  • experiment
    Experiment name. Instrument and experiment names can be specified on the command line with the -e or --experiment option; the option value has the format XPP:xpp12311 or xpp12311. By default instrument and experiment names are determined from input file names; you can use these options to override the defaults (or when your files have non-standard naming).
  • calib-dir
    Path to the calibration directory, can also be given on the command line with the -b or --calib-dir option. The path can include {instr} and {exp} strings which will be replaced with the instrument and experiment names respectively. The default value for the path is /reg/d/psdm/{instr}/{exp}/calib.
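
Two of the conventions in the list above, the module name format and the placeholders in the calibration path, can be sketched in a few lines. The code below is only an illustration and is not taken from psana: ModuleName, parseModuleName, replaceAll and expandCalibDir are hypothetical names.

```cpp
#include <string>

// Hypothetical helper types and functions, not part of psana.
struct ModuleName {
  std::string package;   // defaults to "psana" when omitted
  std::string cls;       // class name
  std::string modifier;  // empty when there is no ":modifier" suffix
};

// Splits "Package.Class:modifier" into its parts.
ModuleName parseModuleName(const std::string& full) {
  ModuleName r;
  std::string base = full;
  std::string::size_type colon = full.find(':');
  if (colon != std::string::npos) {
    base = full.substr(0, colon);
    r.modifier = full.substr(colon + 1);
  }
  std::string::size_type dot = base.find('.');
  if (dot == std::string::npos) {
    r.package = "psana";  // package name omitted
    r.cls = base;
  } else {
    r.package = base.substr(0, dot);
    r.cls = base.substr(dot + 1);
  }
  return r;
}

// Replaces every occurrence of a non-empty key in path with value.
std::string replaceAll(std::string path, const std::string& key,
                       const std::string& value) {
  std::string::size_type pos = 0;
  while ((pos = path.find(key, pos)) != std::string::npos) {
    path.replace(pos, key.size(), value);
    pos += value.size();
  }
  return path;
}

// Expands the {instr} and {exp} placeholders of a calib-dir template.
std::string expandCalibDir(const std::string& tmpl, const std::string& instr,
                           const std::string& exp) {
  return replaceAll(replaceAll(tmpl, "{instr}", instr), "{exp}", exp);
}
```

For example, parseModuleName("TestPackage.ExampleModule:test") yields package TestPackage, class ExampleModule and modifier test, while a name without a dot is assigned to the psana package.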

...

Parameters for user modules appear in separate sections named after the modules. For example, the module with the name "TestPackage.ExampleModule" will read its parameters from the section [TestPackage.ExampleModule]. If the module name includes a modifier after a colon, the module will try to find the parameter value in the corresponding section first, and if it does not exist there it will try to read the parameter from the section which does not have the modifier. In this way modules can share common parameters. For example, the module "TestPackage.ExampleModule:test" will try to read a parameter from the [TestPackage.ExampleModule:test] section first and from the [TestPackage.ExampleModule] section after that.
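
The lookup order described above can be sketched as follows. This is an illustration under stated assumptions, not the psana configuration service: the Config type, the lookupParam function and the parameter names used with it are hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory configuration: section name -> (parameter -> value).
typedef std::map<std::string, std::map<std::string, std::string> > Config;

// Looks up param for the module called moduleName.
// Returns true and sets value if the parameter was found.
bool lookupParam(const Config& cfg, const std::string& moduleName,
                 const std::string& param, std::string& value) {
  // Candidate sections: the full name first, then the name without modifier.
  // When there is no colon, both candidates are the same section.
  std::string candidates[2] = {
      moduleName, moduleName.substr(0, moduleName.find(':'))};
  for (const std::string& section : candidates) {
    Config::const_iterator s = cfg.find(section);
    if (s == cfg.end()) continue;
    std::map<std::string, std::string>::const_iterator p = s->second.find(param);
    if (p != s->second.end()) {
      value = p->second;
      return true;
    }
  }
  return false;
}
```

If [TestPackage.ExampleModule] defines threshold and source, and [TestPackage.ExampleModule:test] defines only threshold, then the instance with the modifier gets its own threshold and the shared source.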

To help manage configuration options, Psana provides a way to select between several sets of parameters in a config file, as well as to override a default set with a few specific values. When specifying a module to load, it can be tagged as follows:

modules = TestPackage.Analysis:mode1

The modifier after the colon tells Psana to first look for configuration parameters in the section [TestPackage.Analysis:mode1] and then in the section [TestPackage.Analysis]. It is also possible to load the same module several times, specifying different configuration options for each instance. Psana will construct each instance with a different name, based on the tag provided.

Here is an example of configuration for some fictional analysis job:

...

By default psana enables messages of the info level (and higher). To enable lower-level messages, pass the -v option to psana: one -v enables trace messages, two -v options enable debug messages. To disable info and warning messages, pass one or two -q options. Error and fatal messages cannot be disabled.

Note: when the message level is disabled, the code in the corresponding macros is not executed at all. Do not put any expressions with side effects into message or code blocks; these are strictly for messaging, not part of your algorithm.
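
The following sketch illustrates both points: how the -v and -q options could map to a message threshold, and why side effects inside a logging macro are dangerous. It is not the MsgLogger implementation. Level, threshold, g_threshold and SKETCH_LOG are hypothetical names, and the sketch assumes that one -q hides info messages and two hide info and warning messages.

```cpp
#include <iostream>

// Hypothetical levels, ordered from least to most severe.
enum Level { Debug = 0, Trace, Info, Warning, Error, Fatal };

// Lowest level still printed for the given counts of -v and -q options.
// Assumption: one -q hides info, two hide info and warning; error and
// fatal messages always stay enabled.
Level threshold(int nVerbose, int nQuiet) {
  int t = Info - nVerbose + nQuiet;
  if (t < Debug) t = Debug;
  if (t > Error) t = Error;
  return static_cast<Level>(t);
}

// Current threshold used by the macro below.
Level g_threshold = Info;

// The streamed expression is evaluated only when the level is enabled.
#define SKETCH_LOG(level, expr)                                \
  do {                                                         \
    if ((level) >= g_threshold) { std::cout << expr << '\n'; } \
  } while (0)

// Anti-pattern: the counter is advanced inside the message expression,
// so the result depends on whether debug messages are enabled.
int countWithSideEffect(Level current) {
  g_threshold = current;
  int counter = 0;
  SKETCH_LOG(Debug, "processing item " << ++counter);
  return counter;
}
```

In this sketch countWithSideEffect(Info) returns 0 because the debug message is disabled and ++counter never runs, while countWithSideEffect(Debug) returns 1. Moving the increment out of the macro makes the behavior independent of the message level.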

Histogramming Service

Psana includes a histogramming service which is a wrapper for the ROOT histogramming package. This service simplifies several tasks such as opening a ROOT file, saving histograms to the file, etc.

The centerpiece of the histogramming service is the histogram manager class. The histogram manager's responsibilities are to open the ROOT file, create histograms, and store histograms to the file. All these tasks are performed transparently to the user; there is no need for additional configuration of this service. To create histograms, one first needs to obtain a reference to the manager instance, which is part of the standard psana environment and is accessible through a method of the environment class. One can then call factory methods of the manager class to create new histograms, which will be automatically saved to a ROOT file. The manager creates a single ROOT file to store all histograms created in a single job. The name of the ROOT file is the same as the job name with the ".root" extension added. The name of the psana job is auto-generated from the name of the first input file, but it can also be set on the command line with the -j <job-name> option.

All factory methods of the histogram manager use a special class to describe the histogram axis (or axes for 2-dim histograms). The name of the class is PSHist::Axis (in a user module the PSHist:: prefix is optional) and it contains binning information for a single histogram axis. It can be constructed in two different ways:

  • Axis(int nbins, double amin, double amax)
    defines axis with fixed-width bins in the range from amin to amax.
  • Axis(int nbins, const double* edges)
    defines axis with variable-width bins, array contains the low edge of each bin plus high edge of the last bin. Total size of the edges array must be nbins+1.
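
To make the two binning schemes concrete, here is a small self-contained sketch with the same two constructor signatures. It is a re-implementation for illustration only, not the PSHist::Axis source: AxisSketch, nbins() and findBin() are hypothetical, and the out-of-range handling (returning -1) is an assumption that differs from ROOT's underflow and overflow bins.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical re-implementation for illustration; not PSHist::Axis itself.
class AxisSketch {
public:
  // Fixed-width bins: nbins equal bins covering [amin, amax).
  AxisSketch(int nbins, double amin, double amax) {
    assert(nbins > 0 && amax > amin);
    for (int i = 0; i <= nbins; ++i)
      m_edges.push_back(amin + (amax - amin) * i / nbins);
  }

  // Variable-width bins: edges must hold nbins+1 increasing values.
  AxisSketch(int nbins, const double* edges)
    : m_edges(edges, edges + nbins + 1) {
    assert(nbins > 0);
  }

  int nbins() const { return static_cast<int>(m_edges.size()) - 1; }

  // Returns the zero-based index of the bin holding x, or -1 if out of range.
  int findBin(double x) const {
    if (x < m_edges.front() || x >= m_edges.back()) return -1;
    std::vector<double>::const_iterator it =
        std::upper_bound(m_edges.begin(), m_edges.end(), x);
    return static_cast<int>(it - m_edges.begin()) - 1;
  }

private:
  std::vector<double> m_edges;  // low edge of each bin plus high edge of the last
};
```

For instance, AxisSketch(4, 0.0, 2.0) has edges 0, 0.5, 1, 1.5, 2, so the value 0.75 falls into bin 1. With the edges array {0, 1, 10, 100} and nbins equal to 3, the value 5 falls into bin 1 and the value 50 into bin 2.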

Here is the list of the factory methods (see also reference for more information):

  • PSHist::H1* hist1i(const std::string& name, const std::string& title, const Axis& axis)
    creates one-dimensional histogram with integer bin contents. Returns pointer to histogram object.
  • PSHist::H1* hist1d(name, title, axis)
    (argument types same as above) creates one-dimensional histogram with double (64-bit) bin contents. Returns pointer to histogram object.
  • PSHist::H1* hist1f(name, title, axis)
    creates one-dimensional histogram with float (32-bit) bin contents. Returns pointer to histogram object.
  • PSHist::H2* hist2i(name, title, xaxis, yaxis)
    creates two-dimensional histogram with integer bin contents. Returns pointer to histogram object.
  • PSHist::H2* hist2d(name, title, xaxis, yaxis)
    creates two-dimensional histogram with double (64-bit) bin contents. Returns pointer to histogram object.
  • PSHist::H2* hist2f(name, title, xaxis, yaxis)
    creates two-dimensional histogram with float (32-bit) bin contents. Returns pointer to histogram object.
  • PSHist::Profile* prof1(name, title, xaxis, const std::string& option="")
    creates profile histogram, option string can be empty, "s", or "i", for meaning see reference. Returns pointer to histogram object.

User code should store the returned histogram pointers (as module data members) and use them later in the code; there is currently no way to retrieve a pointer to a histogram created earlier.

Here is an example of the correct use of the histogramming package (from psana_examples.EBeamHist module):

Code Block
// ==== EBeamHist.h ====
class EBeamHist: public Module {
public:
  .....
private:
  Source m_ebeamSrc;
  PSHist::H1* m_ebeamHisto;
  PSHist::H1* m_chargeHisto;
};

// ==== EBeamHist.cpp ====
EBeamHist::EBeamHist(const std::string& name)
  : Module(name)
  , m_ebeamHisto(0)
  , m_chargeHisto(0)
{
  m_ebeamSrc = configStr("eBeamSource", "BldInfo(EBeam)");
}

void EBeamHist::beginJob(Env& env)
{
  m_ebeamHisto = env.hmgr().hist1i("ebeamHisto", "ebeamL3Energy value", Axis(1000, 0, 50000));
  m_chargeHisto = env.hmgr().hist1i("echargeHisto", "ebeamCharge value", Axis(250, 0, 0.25));
}

void EBeamHist::event(Event& evt, Env& env)
{
  shared_ptr<Psana::Bld::BldDataEBeamV1> ebeam = evt.get(m_ebeamSrc);
  if (ebeam.get()) {
    m_ebeamHisto->fill(ebeam->ebeamL3Energy());
    m_chargeHisto->fill(ebeam->ebeamCharge());
  }
}

A more extensive example is available in Psana User Examples.

Writing User Modules

Here are a few simple steps and guidelines which should help users to write their analysis modules.

  • Everything is done in the context of the off-line analysis releases; your environment should be prepared and you should have a test release set up based on one of the recent analysis releases. Consult the Workbook, which should help you get going.
  • You need your own package which may host several analysis modules. The package name must be unique. If the package has not been created yet, run these commands:

    Code Block
    newpkg MyPackage
    mkdir MyPackage/include MyPackage/src
    
  • Generate skeleton module class from template:

    Code Block
    codegen -l psana-module MyPackage MyModule
    

    this will create two files: MyPackage/include/MyModule.h and MyPackage/src/MyModule.cpp

  • Edit these two files, add necessary data members and implementation of the methods.
  • For examples of accessing different data types see collection of modules in psana_examples package. Reference for all event and configuration data types is located at https://pswww.slac.stanford.edu/swdoc/releases/ana-current/psddl_psana/
  • Reference for other classes in psana framework:  Psana Reference Manual
  • Run scons to build the module library.
  • Create psana config file if necessary.
  • Run psana providing input data, configuration file, etc.
  • It is also possible that somebody wrote a module which you can reuse for your analysis, check the module catalog: psana - Module Catalog

To add your own compiler or linker options to the build (such as to link to a third party library), see this section on customizing the scons build.

Running Psana

After writing and compiling the modules (or choosing standard modules) one can run the psana application with these modules. The psana application is pre-built and does not need to be recompiled. To start the application one needs to provide either a configuration file or the corresponding command-line options. Some information (e.g. user module options) cannot be specified on the command line and always requires a configuration file. Here is the list of command-line options recognized by psana:

...

A more advanced and recommended way is to provide input data as a special dataset string. The dataset string encodes various parameters, some of which are needed to locate data files, while others specify optional behavior such as filtering or live data reading. The general syntax of the dataset string is:

...

a list of colon-separated parameters; parameters have optional values separated from the parameter name by an equal sign:

Code Block
param[=value][:param[=value][...]]

...

These are some of the parameters which are supported in psana:

  • experiment name (which may optionally contain the name of an instrument)

    Code Block
    exp=CXI/cxi12313
    exp=cxi12313
  • run number specification (can be a single run, a range of runs, a series of runs, or a combination of all of the above)

    Code Block
    run=1
    run=10-20
    run=1,2,3,4
    run=1,20-20,31,41
  • file type, if not specified then 'xtc' is the default

    Code Block
    xtc
    h5
  • location of the files, if not specified then files will be searched for in a standard location (/reg/d/psdm/...). If this parameter is specified it needs to be the full path name of the directory where the files are located

    Code Block
    dir=/reg/d/ffb/cxi/cxi12345/xtc
  • input stream number for XTC files, if the value is omitted then one pseudo-random stream is selected (this is useful to balance the load on the FFB storage system, for example):

    Code Block
    one-stream=1
    one-stream
  • allow reading from live XTC files while they're still being recorded (by the DAQ or by the Data Migration service). Note that this feature is only available when running psana at PCDS, in all other cases the option will be ignored:

    Code Block
    live
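
The dataset syntax and the run specification described above can be sketched as two small parsing helpers. This is an illustration, not the parser used by psana: parseDataset and expandRuns are hypothetical names, run ranges are assumed to be inclusive, and values that themselves contain a colon are not handled.

```cpp
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helpers for illustration; not the psana implementation.

// Splits a colon-separated dataset string into a parameter -> value map.
// Parameters without "=value" (such as xtc or live) map to an empty string.
std::map<std::string, std::string> parseDataset(const std::string& ds) {
  std::map<std::string, std::string> params;
  std::istringstream in(ds);
  std::string item;
  while (std::getline(in, item, ':')) {
    if (item.empty()) continue;
    std::string::size_type eq = item.find('=');
    if (eq == std::string::npos) {
      params[item] = "";
    } else {
      params[item.substr(0, eq)] = item.substr(eq + 1);
    }
  }
  return params;
}

// Expands a run specification such as "1,20-22,31" into run numbers.
// Assumption: a range "a-b" includes both a and b.
std::vector<int> expandRuns(const std::string& spec) {
  std::vector<int> runs;
  std::istringstream in(spec);
  std::string item;
  while (std::getline(in, item, ',')) {
    if (item.empty()) continue;
    std::string::size_type dash = item.find('-');
    int first = std::atoi(item.substr(0, dash).c_str());
    int last = (dash == std::string::npos)
                   ? first
                   : std::atoi(item.substr(dash + 1).c_str());
    for (int r = first; r <= last; ++r) runs.push_back(r);
  }
  return runs;
}
```

For example, parsing "exp=CXI/cxi12313:run=1,20-22,31:xtc:live" gives four parameters, and expanding the run value yields runs 1, 20, 21, 22 and 31.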

     

     

A few examples of dataset specification:

...

A set of psana modules is available in the current release, as explained in the Psana Module Catalog. Some of them demonstrate how data can be accessed from user module code. Other modules can be used in data analysis or event filtering. Examples of applications of these modules are available in a separate document:

We continually develop algorithms for the standard set of psana modules. If you find that the algorithm which you need is missing in our collection, you have two options:

...

we would be interested in hearing about it (email pcds-help@slac.stanford.edu). We are interested in implementing algorithms that are useful to our users. Of course, following this document, you can develop a Psana module that implements the algorithm. A resource for sharing the module is the Users' Software Repository.