This page includes material from the PSDM space, so navigation on the page does not work properly. To see the original page, go to psana - User Manual.

Introduction

This document describes the C++ analysis framework for LCLS and how users can make use of its features. The psana design borrows ideas from a multitude of other frameworks such as pyana, myana, and the BaBar framework. Its main principles are summarized here:

support processing of both XTC and HDF5 data formats
user code should be independent of specific data format
should be easy to use and extend for end users
support re-use of the existing analysis code
common simple configuration of user analysis code

This manual is accompanied by the Psana Reference Manual, which describes the interfaces of the classes available in psana.

Framework Architecture

The central part of the framework is a regular pre-built application (psana) which can dynamically load one or more user analysis modules which are written in C++. The core application is responsible for the following tasks:

loading and initializing all user modules
loading one of the input modules to read data from XTC or HDF5
calling appropriate methods of user modules based on the data being processed
providing access to data as a set of C++ classes
providing other services such as histogramming to user modules

Other important components of the Psana architecture:

user module – an instance of a C++ class which inherits from the pre-defined Module class and defines a few special methods which are called by the framework
event – special object which transparently stores all event data
environment – special object which stores non-event data such as configuration objects or EPICS data

Analysis Job Life Cycle

A psana analysis job goes through cycles of state changes such as initialization, configuration, and event processing, calling methods of the user modules at every such change. This model closely follows the production activities in the LCLS on-line system. The DAQ system defines many types of transitions in its data-taking activity; the most interesting ones are:

Configure - provides configuration data for complete setup
BeginRun - start of data taking for one run
BeginCalibCycle - start of a new scan, some configuration data may change at this point
L1Accept - this is a regular event containing event data from all detectors
EndCalibCycle - end of single scan
EndRun - end of data taking for one run
Unconfigure - stop of all activity

Typically there will be more than one run taken with the same configuration, so there may be more than one BeginRun/EndRun transition for one Configure/Unconfigure, but a data file from a single run should contain only one BeginRun/EndRun. Depending on the setup there could be one or more BeginCalibCycle/EndCalibCycle transitions in a single run.

For each of the above transitions psana will call corresponding method in user modules notifying them of the possible change in the configuration or just providing event data. Following method names are defined in the user modules:

beginJob() – this method is called once per analysis job, when the first Configure transition happens. If there is more than one Configure in a single job (when processing multiple runs) this method is not called again; use beginRun() to observe configuration changes in this case. This method can access all configuration data through the environment object.
beginRun() – this method is called for every new BeginRun, so it will be called multiple times when processing multiple runs in the same job. This method can access all configuration data through environment object.
beginCalibCycle() – this method is called for every new BeginCalibCycle, so it will be called multiple times when processing multiple runs in the same job or when single run contains multiple scans. This method can access all configuration data through environment object.
event() – this method is called for every new L1Accept, it has access to event data through event object as well as configuration data through environment object.
endCalibCycle() – this method is called for every new EndCalibCycle, it has access to configuration data through environment object.
endRun() – this method is called for every new EndRun, it has access to configuration data through environment object.
endJob() – this method is called once at the end of analysis job, it has access to configuration data through environment object.
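
The mapping from transitions to method calls described above can be pictured with a small stand-alone sketch. It does not use psana itself: the Transition enum and the methodFor() and dispatch() functions are hypothetical names introduced only for illustration, and the placement of the endJob() call after the last transition is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the DAQ transition types listed above.
enum class Transition {
  Configure, BeginRun, BeginCalibCycle, L1Accept,
  EndCalibCycle, EndRun, Unconfigure
};

// Name of the module method associated with a transition.
// An empty string means that no method is called.
std::string methodFor(Transition t)
{
  switch (t) {
  case Transition::Configure:       return "beginJob";
  case Transition::BeginRun:        return "beginRun";
  case Transition::BeginCalibCycle: return "beginCalibCycle";
  case Transition::L1Accept:        return "event";
  case Transition::EndCalibCycle:   return "endCalibCycle";
  case Transition::EndRun:          return "endRun";
  case Transition::Unconfigure:     return "";
  }
  assert(false && "unknown transition value");
  return "";
}

// Returns, in order, the module methods called for a sequence of transitions.
std::vector<std::string> dispatch(const std::vector<Transition>& transitions)
{
  std::vector<std::string> calls;
  bool jobStarted = false;
  for (Transition t : transitions) {
    if (t == Transition::Configure) {
      // beginJob() is called for the first Configure only
      if (jobStarted) continue;
      jobStarted = true;
    }
    const std::string method = methodFor(t);
    if (!method.empty()) calls.push_back(method);
  }
  // Assumption: endJob() is called once, after the input is exhausted
  calls.push_back("endJob");
  return calls;
}
```

Calling dispatch() with two runs that share one job shows that the second Configure does not produce a second beginJob(), while beginRun() and endRun() are reported once per run.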

Typically psana will iterate through all transitions/events from the input files. User modules have limited control over this event loop: a module can request to skip a particular event, stop iteration early, or abort the job using one of the methods described below.
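
A rough picture of how such requests could affect the event loop is sketched below. This is not the psana implementation: the Status enum, ModuleFn, LoopResult and runEventLoop() are hypothetical names, and the exact semantics (a skipped event is hidden from later modules, a stop ends the loop normally, an abort ends it with an error flag) are assumptions made for the sketch.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical request that a module can make after seeing an event.
enum class Status { Ok, Skip, Stop, Abort };

// Hypothetical module: a callable that receives the event number.
using ModuleFn = std::function<Status(int eventNumber)>;

struct LoopResult {
  int eventsSeen = 0;    // events handed to the first module
  bool aborted = false;  // true if some module requested an abort
};

LoopResult runEventLoop(int nEvents, const std::vector<ModuleFn>& modules)
{
  assert(nEvents >= 0);
  LoopResult result;
  for (int i = 0; i < nEvents; ++i) {
    ++result.eventsSeen;
    Status status = Status::Ok;
    for (const ModuleFn& module : modules) {
      status = module(i);
      // any request other than Ok hides this event from later modules
      if (status != Status::Ok) break;
    }
    if (status == Status::Abort) {
      result.aborted = true;
      break;
    }
    if (status == Status::Stop) break;
    // Status::Skip needs no extra action, the loop moves to the next event
  }
  return result;
}
```

With a first module that skips odd events and stops at event 5, a second module in the chain would only be called for events 0, 2 and 4.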

User Modules

A user module in psana is an instance of a C++ class which inherits from the Module class (defined in file psana/Module.h) and implements several methods. These methods are already mentioned above; here is a more formal description of each method:

void beginJob(Env& env)
Method called once at the beginning of the job. Environment object contains configuration data from the first Configure transition. Default implementation of this method does not do anything.
void beginRun(Env& env)
Method called at the beginning of every new run. Default implementation of this method does not do anything.
void beginCalibCycle(Env& env)
Method called at the beginning of every new scan. Default implementation of this method does not do anything.
void event(Event& evt, Env& env)
Method called for every regular event. Event data is accessible through the evt argument. There is no default implementation for this method and a user module must provide at least this method.
void endCalibCycle(Env& env)
Method called at the end of every scan, can be used to process scan-level statistics collected in event(). Default implementation of this method does not do anything.
void endRun(Env& env)
Method called at the end of every run, can be used to process run-level statistics collected in event(). Default implementation of this method does not do anything.
void endJob(Env& env)
Method called once at the end of analysis job, can be used to process job-level statistics collected in event(). Default implementation of this method does not do anything.

In addition to the event() method every module class must provide a constructor which takes a string argument giving the name of the module. Additionally it has to provide a special factory function used to instantiate the module from the shared library; a special macro is defined for the definition of this factory function.

Here is a minimal example of the module class declaration with only the event() method implemented; many non-essential details are skipped:

Code Block

title	Package/ExampleModule.h


#include <string>

#include "psana/Module.h"

namespace Package {
class ExampleModule: public Module {
public:

  // Constructor takes module name as a parameter
  ExampleModule(const std::string& name);

  // Implementation of event() from base class
  virtual void event(Event& evt, Env& env);

};
} // namespace Package

Definition of the factory function and methods:

Code Block

title	Package/ExampleModule.cpp


#include "Package/ExampleModule.h"
#include "MsgLogger/MsgLogger.h"
#include "PSEvt/EventId.h"

// define factory function
using namespace Package;
PSANA_MODULE_FACTORY(ExampleModule)

// Constructor
ExampleModule::ExampleModule(const std::string& name)
  : Module(name)
{
}

void 
ExampleModule::event(Event& evt, Env& env)
{
  // get event ID
  shared_ptr<EventId> eventId = evt.get();
  if (not eventId.get()) {
    MsgLog(name(), info, "event ID not found");
  } else {
    MsgLog(name(), info, "event ID: " << *eventId);
  }
}

This simple example already does something useful: it retrieves and prints the event ID (copied from the standard PrintEventId module). Actual modules will do more complex things, but this is a simple example of obtaining something from event data.
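
To give an idea of what a factory macro of this kind can do, here is a stand-alone sketch. It is not the definition of PSANA_MODULE_FACTORY: the macro SKETCH_MODULE_FACTORY, the function name it generates, the simplified Module base class and the CountingModule class are all hypothetical and exist only in this example. The sketch assumes that the macro expands to a function with C linkage, so that a framework could find it by name in a shared library (for example with dlsym()) and call it to create the module.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical, much simplified stand-in for the framework base class.
class Module {
public:
  explicit Module(const std::string& name) : m_name(name) {}
  virtual ~Module() {}
  const std::string& name() const { return m_name; }
  virtual void event() = 0;
private:
  std::string m_name;
};

// Hypothetical factory macro: defines a C-linkage function with a
// predictable name which creates an instance of the given class.
#define SKETCH_MODULE_FACTORY(UserModule) \
  extern "C" Module* sketch_create_##UserModule(const std::string& name) \
  { return new UserModule(name); }

// Example user module which only counts events.
class CountingModule : public Module {
public:
  explicit CountingModule(const std::string& name) : Module(name), m_count(0) {}
  virtual void event() { ++m_count; }
  int count() const { return m_count; }
private:
  int m_count;
};

SKETCH_MODULE_FACTORY(CountingModule)

// What a framework might do after locating the factory function;
// here the function is called directly instead of being looked up.
std::unique_ptr<Module> loadCountingModule(const std::string& name)
{
  Module* module = sketch_create_CountingModule(name);
  assert(module != 0);
  return std::unique_ptr<Module>(module);
}
```

The framework only needs to know the base class and the naming convention of the factory function, which is why user modules can be added without rebuilding the core application.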

The easiest way to write new user modules is to use the codegen script to generate a class from a predefined template. This command will create a new module ExampleModule in package TestPackage and will copy the generated files to the directories in TestPackage:

Code Block
codegen -l psana-module TestPackage ExampleModule

Data Access in User Modules

As already mentioned above, all event data is accessible to a user module via the Event object, and all non-event data is accessible through the Env object. The previous example shows a simple use case of extracting data from the event. This section gives a more detailed description of the Event and Env types and their methods.

When extracting data from the event or environment it is necessary to specify at least the type of the data (EventId in the above example). If there are multiple objects of the same type in the event then additional identifying information must be provided – source address and/or additional string key.

Data Source Address

Many pieces of data in the event originate from devices or processes which are parts of the LCLS DAQ. Devices in the DAQ system are identified by their addresses, which are special C++ data types. There are three types of addresses defined by DAQ:

DetInfo (class name Pds::DetInfo) – this is the most frequently used type and it defines all regular devices used in DAQ such as cameras, Acqiris, etc. Complete address specification includes 4 items:
- Detector type, one of the Pds::DetInfo::Detector enum values.
- Detector ID, a number, in case there is more than one detector of the same type in a system they will have different IDs.
- Device type, one of the Pds::DetInfo::Device enum values.
- Device ID, a number, in case there is more than one device of the same type in a system they will have different IDs.
BldInfo (class name Pds::BldInfo) – this address type is used for Beam Line Data sources, particular source is identified by the Pds::BldInfo::Type enum value.
ProcInfo (class name Pds::ProcInfo) – this address type is used rarely, and only for information produced by applications constituting DAQ. Sources of this type are identified by IP address of the host where application is running.

(If you look at the C++ code you'll notice that all above classes also include process ID, but it is not used by psana and can be set to 0 if needed.)

User modules should not need to use the above C++ classes directly; instead psana provides a facility that simplifies specification of the addresses and does not require exact addresses to be known. The class which provides support for these features is called Source (full name is PSEvt::Source). It can be constructed from one of the three above classes, but the most interesting use case is the constructor which accepts a string specification of an address. The string specification accepts the following formats:

"DetInfo(Detector.DetID:Device.DevID)"
Corresponds to DetInfo address type. Detector is the detector name (one of the names of the constants in Pds::DetInfo::Detector enum). DetID is a detector ID number. Device is the device name (one of the names of the constants in Pds::DetInfo::Device enum). DevID is a device ID number. Any or all parts of the specification may be missing. If detector ID or device ID is missing then the separating dot is optional. If both device and device ID are missing then the separating colon is optional. Missing parts can also be replaced with the wildcard '*' symbol.
"Detector.DetID:Device.DevID"
Same as the above specification, DetInfo and parentheses can be omitted.
"Detector-DetID|Device-DevID"
Same as above, this format is supported for compatibility with pyana but is deprecated.
"BldInfo(BldType)"
Corresponds to BldInfo address type. BldType is one of the names of the constants in Pds::BldInfo::Type enum (currently defined types are EBeam, PhaseCavity, FEEGasDetEnergy, Nh2Sb1Ipm01). BldType can be omitted.
"BldType"
Same as above, but you cannot omit BldType here.
"ProcInfo(ipAddr)"
Corresponds to ProcInfo address type. ipAddr is an IPv4 address in decimal dot notation (123.123.123.123). ipAddr can be omitted.

If the specification includes all pieces then it is exact and can only match a single data source. If parts are missing (or replaced with wildcards) then the specification is a match specification, and when requesting data from the event there may be more than one data source matching it. In this case the first matching source (in unspecified order) will be used. An inexact specification can simplify data access when exact addresses are not known in advance, but one has to be careful if there are multiple devices matching the same specification.

Here are a few examples of the exact address specifications:

"DetInfo(AmoITof.0:Acqiris.0)"
"AmoITof.0:Acqiris.0" – same as above
"DetInfo(SxrEndstation.0:Opal1000.0)"
"BldInfo(FEEGasDetEnergy)"
"FEEGasDetEnergy" – same as above
"ProcInfo(0.0.0.0)"

Here are the examples of the address matches:

"DetInfo(AmoITof.*:Acqiris.*)"
"DetInfo(AmoITof:Acqiris)" – same as above
"AmoITof:Acqiris" – same as above
"DetInfo(AmoITof:*)"
"DetInfo(AmoITof)" – same as above
"AmoITof" – same as above
"DetInfo(*:Acqiris)"
"DetInfo(:Acqiris)" – same as above
"*:Acqiris" – same as above
"DetInfo(*.*:*.*)"
"DetInfo()" – same as above
"BldInfo()"
"" – will match any address type
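
The difference between exact and match specifications can be illustrated with a small stand-alone parser. This is only a sketch and not the PSEvt::Source implementation: DetAddress, DetPattern, parseHalf() and parseDetPattern() are hypothetical names, detector and device names are kept as plain strings instead of enum values, and only the "DetInfo(Detector.DetID:Device.DevID)" and "Detector.DetID:Device.DevID" formats are handled. The sketch does not validate its input (a non-numeric ID is read as 0).

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical exact address, names are strings instead of enum values.
struct DetAddress {
  std::string detector; int detId;
  std::string device;   int devId;
};

// Hypothetical pattern; an empty name or a negative ID means "any".
struct DetPattern {
  std::string detector; int detId;
  std::string device;   int devId;

  bool isExact() const {
    return !detector.empty() && detId >= 0 && !device.empty() && devId >= 0;
  }
  bool matches(const DetAddress& a) const {
    return (detector.empty() || detector == a.detector)
        && (detId < 0 || detId == a.detId)
        && (device.empty() || device == a.device)
        && (devId < 0 || devId == a.devId);
  }
};

// Splits "Name.ID", where either part may be missing or be "*".
static void parseHalf(const std::string& half, std::string& name, int& id)
{
  const std::string::size_type dot = half.find('.');
  name = half.substr(0, dot);
  if (name == "*") name.clear();
  id = -1;
  if (dot != std::string::npos) {
    const std::string idStr = half.substr(dot + 1);
    if (!idStr.empty() && idStr != "*") id = std::atoi(idStr.c_str());
  }
}

DetPattern parseDetPattern(std::string spec)
{
  // strip optional "DetInfo(" ... ")" wrapper
  const std::string prefix = "DetInfo(";
  if (spec.compare(0, prefix.size(), prefix) == 0) {
    assert(spec[spec.size() - 1] == ')');
    spec = spec.substr(prefix.size(), spec.size() - prefix.size() - 1);
  }
  const std::string::size_type colon = spec.find(':');
  DetPattern p;
  parseHalf(spec.substr(0, colon), p.detector, p.detId);
  if (colon == std::string::npos) {
    p.device.clear();
    p.devId = -1;
  } else {
    parseHalf(spec.substr(colon + 1), p.device, p.devId);
  }
  return p;
}
```

For example, parseDetPattern("AmoITof:Acqiris") gives a pattern which is not exact and matches any AmoITof/Acqiris address regardless of the ID numbers, while parseDetPattern("AmoITof.0:Acqiris.0") is exact.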

String Key

An additional key that may be provided when storing or retrieving data from the event is used to distinguish between data objects of the same type and address. As an example, the raw data that come from an XTC file are stored with the default empty key. A user module can apply some algorithm to the data and store a new version of the same data using a non-default key (such as "fixed" or "calibrated").
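
The idea that a data object is identified by its type, its source address and a string key can be sketched with a small container. This is not the psana Event class: KeyedStore and its put() and get() methods are hypothetical, and the source address is represented by a plain string for simplicity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <tuple>
#include <typeindex>
#include <typeinfo>

// Hypothetical store which identifies objects by (type, source, key).
class KeyedStore {
public:
  // Stores an object; the default key is the empty string.
  template <typename T>
  void put(std::shared_ptr<T> data, const std::string& source,
           const std::string& key = std::string())
  {
    assert(data);  // null objects are not stored in this sketch
    m_data[Key(std::type_index(typeid(T)), source, key)] = data;
  }

  // Returns the stored object or a null pointer if nothing matches.
  template <typename T>
  std::shared_ptr<T> get(const std::string& source,
                         const std::string& key = std::string()) const
  {
    auto it = m_data.find(Key(std::type_index(typeid(T)), source, key));
    if (it == m_data.end()) return std::shared_ptr<T>();
    // safe because the type is part of the lookup key
    return std::static_pointer_cast<T>(it->second);
  }

private:
  typedef std::tuple<std::type_index, std::string, std::string> Key;
  std::map<Key, std::shared_ptr<void> > m_data;
};
```

With such a store the raw object and a "calibrated" version of it can coexist for the same type and source, and a request with an unknown key or a different type simply returns a null pointer.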

Event Data

Event data can be
