1. About this Tutorial

The main objective of this session is to introduce and explain the new Python interface for accessing LCLS data from analysis applications. This software framework is known as "Interactive psana", or just ipsana for short. The idea of implementing such a tool was first suggested around 1.5 years ago at the joint PCDS/SRD meeting (look for ipsana). Though its underlying machinery is largely based on the batch version of psana, the interface to the data is much simpler and more intuitive, and it requires much less code to be written by a user in order to get to "that CSPad image" (or that EPICS PV, etc.). The new framework won't work for everyone, specifically for those users who have either heavily invested in the modular code of the batch framework, or who need the performance of modules written in C++. Still, our intent is to demonstrate the power of the new approach and to encourage using the tool where it seems appropriate.

...

Scope of this Tutorial

  • This tutorial doesn't provide any real data analysis; it's just an explanation of basic techniques for accessing your data, not for using it!
  • This is not a User's Guide or Reference Manual for the interactive framework.

Users of pyana and myana, attention: the pyana and myana frameworks will be phased out at some point. There is a variety of reasons why:

...

As our understanding of what kind of analysis framework works better for the users and for ourselves as developers evolved, we realized that we needed to develop a tool which would have an easier interface to the data, a better internal architecture, and be easier to maintain and extend for new data types. Hence we developed the psana framework, which has a number of advantages:

  1. Better API to the data.
  2. Possibility of writing (mixing) modules in C++ and Python. Modules written in different languages will still see the same data, and they can also exchange data within the framework.
  3. Support for both XTC and HDF5 files.
  4. Ability to read the "live" files while they're being recorded by the DAQ system or data movers.
  5. Ability to read data from shared memory on the monitoring machines.

The last two features open the powerful possibility of using psana for real-time monitoring of data while reusing the same code which might be developed for traditional OFFLINE processing/analysis.

2. Getting Started

Example Location

All examples can be found under the following directory:

Code Block
/reg/g/psdm/tutorials

Data Files

In order to make our examples as close to the "real" analysis environment as possible we chose to create 6 pseudo experiments (one per instrument):

...

All data files are open for reading by anyone who can log onto PCDS computers. Moreover, those directories (like scratch/, ftc/) are open for writing by anyone. And yes, one can also see these experiments in the Web Portal.

Setting up the Environment

  • Make sure you can run X11 applications. Most examples of this tutorial will do a simple visualization. You can pass the -X or -Y argument to ssh to make sure you can forward the screens to your local machine, e.g.:
    Code Block
    
    ssh -Y psexport
    
  • Log onto any machine of the interactive analysis pools psananeh or psanafeh, e.g.:
    Code Block
    
    ssh psananeh
    
  • Make sure you source (just once) one of the following scripts (depending on which UNIX shell you are using). When using the bash shell:
    Code Block
    . /reg/g/psdm/etc/sit_env.sh
    
    Or, when using the csh shell family:
    Code Block
    source /reg/g/psdm/etc/sit_env.csh
    
    Note that the default shell for most LCLS users is bash.
  • Run (just once) the following command, which will set up a proper OFFLINE analysis environment for the latest analysis release:
    Code Block
    sit_setup ana-current
    

At this point you are ready to go. To test that your environment is set up correctly, try running psana without any parameters. If your environment is properly set you should see something like this:

Code Block
psana
[error:2013-06-06 20:54:44.131:PSAnaApp.cpp:218] no analysis modules specified

3. Basic Examples

This section presents a few simple scripts which have been developed to underline the main ideas behind the framework's API. The code of the examples along with a simple HOWTO file can be found at:

Code Block
/reg/g/psdm/tutorials/common/data_access_methods/

Printing Identifiers for all Events in a Run

First try this:

Code Block
./print_event_id.py

Then look at the code. It will do three things:

  • Import the psana module:
    Code Block
    import psana
    
  • Open the data set. Note the syntax for the data set specification string:
    Code Block
    dsname = "exp=sxrtut13:run=366"
    ds = psana.DataSet(dsname)
    
  • Note that by default the framework will look for XTC files at the standard location where all experimental data are supposed to be. If you want to play with HDF5 (in case there is an HDF5 version of the run) you can change that string by appending h5 at the end:
    Code Block
    dsname = "exp=sxrtut13:run=366:h5"
    
  • The next thing this code will do is iterate over all events. At each step you will get a reference to an event object evt, from which the code will extract and print the identifier of the event:
    Code Block
    for i, evt in enumerate(ds.events()):
        evtnum = i + 1                 # enumerate() counts from 0; number events from 1
        id = evt.get(psana.EventId)    # fetch the identifier of this event
        print "%6d:" % evtnum, id
    

...

Code Block
./print_event_id.py
     1: XtcEventId(run=366, time=2013-04-21 04:37:39.343773772-07, fiducials=38877, ticks=329342, vector=19553)
     2: XtcEventId(run=366, time=2013-04-21 04:37:39.360457259-07, fiducials=38883, ticks=331442, vector=19554)
     3: XtcEventId(run=366, time=2013-04-21 04:37:39.377123777-07, fiducials=38889, ticks=330560, vector=19555)
     4: XtcEventId(run=366, time=2013-04-21 04:37:39.393797466-07, fiducials=38895, ticks=329762, vector=19556)
     5: XtcEventId(run=366, time=2013-04-21 04:37:39.410477971-07, fiducials=38901, ticks=331204, vector=19557)
     6: XtcEventId(run=366, time=2013-04-21 04:37:39.427145705-07, fiducials=38907, ticks=331036, vector=19558)
     7: XtcEventId(run=366, time=2013-04-21 04:37:39.443816588-07, fiducials=38913, ticks=329370, vector=19559)
     8: XtcEventId(run=366, time=2013-04-21 04:37:39.460499778-07, fiducials=38919, ticks=331414, vector=19560)
     9: XtcEventId(run=366, time=2013-04-21 04:37:39.477167658-07, fiducials=38925, ticks=330616, vector=19561)
    10: XtcEventId(run=366, time=2013-04-21 04:37:39.493840079-07, fiducials=38931, ticks=329720, vector=19562)
    ...
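The data set specification strings used in this example follow a simple colon-separated pattern. As a minimal sketch, the helper below assembles such a string from its parts. Note that make_dsname is a hypothetical convenience function written for this illustration; it is not part of psana, and it only covers the two forms shown above (with and without the trailing h5).

```python
def make_dsname(experiment, run, hdf5=False):
    """Build a data set specification string such as 'exp=sxrtut13:run=366'.

    Hypothetical helper for illustration only; it is not part of psana.
    """
    parts = ["exp=%s" % experiment, "run=%d" % run]
    if hdf5:
        # Appending 'h5' selects the HDF5 version of the run (if one exists).
        parts.append("h5")
    return ":".join(parts)


print(make_dsname("sxrtut13", 366))
print(make_dsname("sxrtut13", 366, hdf5=True))
```

The resulting string would then be passed to psana.DataSet() exactly as in the script above.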

Printing a Catalog of Event Components

The example can be run like this:

Code Block


./print_event_keys.py

Components of the first event found in the dataset:
  EventKey(type=psana.EvrData.DataV3, src='DetInfo(NoDetector.0:Evr.0)')
  EventKey(type=psana.Acqiris.DataDescV1,         src='DetInfo(SxrEndstation.0:Acqiris.0)')
  EventKey(type=psana.Acqiris.DataDescV1,         src='DetInfo(SxrEndstation.0:Acqiris.1)')
  EventKey(type=psana.Bld.BldDataEBeamV3,         src='BldInfo(EBeam)')
  EventKey(type=psana.Bld.BldDataPhaseCavity,     src='BldInfo(PhaseCavity)')
  EventKey(type=psana.Bld.BldDataFEEGasDetEnergy, src='BldInfo(FEEGasDetEnergy)')
  EventKey(type=psana.Bld.BldDataGMDV1,           src='BldInfo(GMD)')
  EventKey(type=psana.EventId)
  EventKey(type=None)

Why is it so important to know this information? Because these parameters will tell you:

  • What's inside the event
  • How to extract the data objects associated with these keys

The type and the source in the output shown above will allow you to access the data you need by using the following get functions:

Code Block


obj = evt.get( psana.EvrData.DataV3,             psana.Source('DetInfo(NoDetector.0:Evr.0)'))
obj = evt.get( psana.Acqiris.DataDescV1,         psana.Source('DetInfo(SxrEndstation.0:Acqiris.0)'))
obj = evt.get( psana.Acqiris.DataDescV1,         psana.Source('DetInfo(SxrEndstation.0:Acqiris.1)'))
obj = evt.get( psana.Bld.BldDataEBeamV3,         psana.Source('BldInfo(EBeam)'))
obj = evt.get( psana.Bld.BldDataPhaseCavity,     psana.Source('BldInfo(PhaseCavity)'))
obj = evt.get( psana.Bld.BldDataFEEGasDetEnergy, psana.Source('BldInfo(FEEGasDetEnergy)'))
obj = evt.get( psana.Bld.BldDataGMDV1,           psana.Source('BldInfo(GMD)'))
obj = evt.get( psana.EventId)


You already encountered one of these get functions in the very first example, where you were extracting the event identifier. The type field indicates what kind of data you are accessing; the source field indicates the instance of that particular detector. In this example there are two Acqiris digitizers with the same data type. Note that event components obtained through this API will be objects of various classes. A full catalog of those objects can be found in the DOXYGEN documentation, which is auto-generated from the code of the OFFLINE releases.
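The mapping from a printed key to its get call is mechanical, which the following sketch tries to make explicit. It parses one line of the catalog and returns the text of the matching call. The function getter_for and the regular expression are assumptions made for this illustration (they are not part of psana), and the pattern only covers the key formats shown in the output above.

```python
import re

# Matches catalog lines such as:
#   EventKey(type=psana.Bld.BldDataGMDV1, src='BldInfo(GMD)')
#   EventKey(type=psana.EventId)
KEY_PATTERN = re.compile(
    r"EventKey\(type=(?P<type>[\w.]+)(?:,\s*src='(?P<src>[^']*)')?\)"
)


def getter_for(key_line):
    """Return the text of the evt.get(...) call for one catalog line.

    Hypothetical helper; returns None for keys without a usable type.
    """
    match = KEY_PATTERN.search(key_line)
    if match is None or match.group("type") == "None":
        return None
    if match.group("src") is None:
        # Keys without a source (e.g. psana.EventId) need only the type.
        return "evt.get(%s)" % match.group("type")
    return "evt.get(%s, psana.Source('%s'))" % (
        match.group("type"), match.group("src"))


print(getter_for("EventKey(type=psana.Bld.BldDataGMDV1, src='BldInfo(GMD)')"))
print(getter_for("EventKey(type=psana.EventId)"))
```

The sketch only produces text; in a real script you would write the resulting call directly, as in the list of get functions above.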

Iterating over scans and events

Some of our experiments (in particular XPP) rely heavily on so-called scans, also known as "Calibration Transitions", while taking their data. Each DAQ run has one or many scans. Events are recorded within the scope of a scan. The new framework has a special provision for scans through the iterator of scans. The idea behind the following example is:

  • Open a data set which has multiple scans in each run
  • Iterate over scans
  • Iterate over events in each scan

This simple application knows about scan boundaries. Moreover, this example illustrates how to open a data set composed of many runs (processing a series of runs at once). There are two examples in this set:

Code Block


./scans_in_runs_xtc.py
./scans_in_runs_hdf.py

These two scripts perform the same processing; the only difference is which data format they access. The first example reads XTC files, while the second one reads HDF5 files. When running these examples you will notice that the HDF5 version is significantly faster. This is because we don't yet support indexing for XTC files.
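The nested loop structure described in this section (runs, then scans, then events) can be sketched as follows. To keep the sketch runnable without psana, it uses small stand-in classes instead of a real data set. The method names runs(), steps() and events() are assumptions about how the iterators are spelled, and the number attribute is purely a stand-in; check the tutorial scripts for the actual calls.

```python
class FakeStep(object):
    """Stand-in for one scan; holds placeholder events."""
    def __init__(self, events):
        self._events = events

    def events(self):
        return iter(self._events)


class FakeRun(object):
    """Stand-in for one run; holds a list of scans."""
    def __init__(self, number, steps):
        self.number = number
        self._steps = steps

    def steps(self):
        return iter(self._steps)


class FakeDataSet(object):
    """Stand-in for a data set composed of several runs."""
    def __init__(self, runs):
        self._runs = runs

    def runs(self):
        return iter(self._runs)


def count_events_per_scan(ds):
    """Return a list of (run number, scan index, number of events)."""
    counts = []
    for run in ds.runs():
        for scan_index, scan in enumerate(run.steps()):
            n_events = sum(1 for _ in scan.events())
            counts.append((run.number, scan_index, n_events))
    return counts


ds = FakeDataSet([
    FakeRun(1, [FakeStep(["e1", "e2"]), FakeStep(["e3"])]),
    FakeRun(2, [FakeStep(["e4", "e5", "e6"])]),
])
for run_number, scan_index, n_events in count_events_per_scan(ds):
    print("run %d, scan %d: %d events" % (run_number, scan_index, n_events))
```

Because the loop body sees each scan as its own object, per-scan bookkeeping (such as the event count here) needs no manual detection of scan boundaries.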

4. Instrument Specific Examples

This section includes a number of examples which are relevant to different instruments. Their primary goal is to illustrate how to access data objects which are specific to each instrument.

XCS

Princeton Movie

The code for these examples is found at:

Code Block
/reg/g/psdm/tutorials/xcs/princeton_movie/

SXR

Correlation Plots for Signals from GMD and Diode

The code for these examples is found at:

Code Block
/reg/g/psdm/tutorials/sxr/gmd_vs_diode/

CXI

Diffraction Patterns on the CSPad Detector

The code for these examples is found at:

Code Block
/reg/g/psdm/tutorials/cxi/cspad_imaging/

There are three examples in this directory showing increasingly complex processing of data from the CSPad detector. These examples also introduce the ability to provide a configuration file describing the parameters of the analysis. These configuration files follow the usual psana syntax.

  1. dump_2x1_elements: The first example illustrates how individual 2x1 structures can be located in an event and displayed as-is, without any processing. This example doesn't use any psana modules.
  2. frame_reco: The second example adds one of the standard psana modules in order to reconstruct a full CSPad frame from the corresponding 2x1 components. This example is based on interactive psana's ability to run events through an optional chain of modules. The modules are specified and configured via an external configuration file 'frame_reco.cfg', which has the usual psana syntax.
  3. frame_reco_calib: The last example adds one more module to calibrate (pedestal subtraction and gain correction) reconstructed CSPad images. The modules are configured in an external configuration file 'frame_reco_calib.cfg'.

Please run these examples in the order described above:

Code Block
./dump_2x1_elements.py
./frame_reco.py
./frame_reco_calib.py
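To give a feel for what such a configuration file contains, here is a small sketch that reads a psana-style file with Python 3's standard configparser module. It assumes the usual layout: a [psana] section whose modules option lists the module chain, plus one optional section per module. The module names and the subtract_pedestals option are placeholders invented for this illustration; the real names are in 'frame_reco.cfg' and 'frame_reco_calib.cfg'.

```python
import configparser

# Placeholder configuration text; module and option names are hypothetical.
EXAMPLE_CFG = """
[psana]
modules = ExamplePkg.FrameReconstructor ExamplePkg.FrameCalibrator

[ExamplePkg.FrameCalibrator]
subtract_pedestals = yes
"""


def _parse(cfg_text):
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser


def list_modules(cfg_text):
    """Return the module chain in the order events pass through it."""
    return _parse(cfg_text).get("psana", "modules").split()


def module_options(cfg_text, module_name):
    """Return the options of one module, or {} if it has no section."""
    parser = _parse(cfg_text)
    if not parser.has_section(module_name):
        return {}
    return dict(parser.items(module_name))


for name in list_modules(EXAMPLE_CFG):
    print(name, module_options(EXAMPLE_CFG, name))
```

Listing the modules this way can be a quick check of which processing steps a given example will apply before you run it.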

5. Something Less Trivial

Custom HDF5 translator

Let's suppose the user doesn't want to use the standard HDF5 translator, but prefers to write a data extraction tool to extract a particular detector from XTC files and make the data available for further analysis in Matlab. At this point we should already know how to get images from the raw files using ipsana. Now the only remaining problem is to store them in some form which is readable from Matlab. Perhaps the best way to solve the problem would be to store those images in an HDF5 file using some library. This example uses the PyTables package to dump numpy arrays into an HDF5 output file. This package is known for its simple API, which doesn't require a user to learn the low-level library h5py.

The code for this example is found at:

Code Block
/reg/g/psdm/tutorials/common/hdf5_translator/