1. About this Tutorial
The main objective of this session is to introduce and explain the new Python interface for accessing LCLS data from analysis applications. This software framework is known as "Interactive psana" or just ipsana. The idea of implementing such a tool was first suggested around 1.5 years ago at the joint PCDS/SRD meeting (look for Interactive Psana). Though its underlying machinery is largely based on the batch version of psana modules, the interface to the data is much simpler and more intuitive, and it requires much less code to be written by a user in order to get to "that CSPad image" (or that EPICS PV, etc.). The new framework won't work for everyone, specifically for those users who have either heavily invested in the modular code of the batch framework, or who need the performance of modules written in C++. Still, our intent is to demonstrate the power of the new approach and to encourage using the tool where it seems appropriate.
...
Scope of this Tutorial
- This tutorial doesn't provide any real data analysis; it's just an explanation of basic techniques for accessing your data, not for using it!
- This is not a Users Guide or Reference Manual for the interactive framework.
Users of pyana and myana, attention: The pyana and myana frameworks will be phased out at some point. There is a variety of reasons why:
...
As our understanding of what kind of analysis framework works better for the users and for ourselves as developers evolved, we realized that we needed to develop a tool which would have an easier interface to the data, a better internal architecture and be easier to maintain and extend for new data types. Hence we developed the psana modules framework, which has a number of advantages:
- Better API to the data.
- Possibility of writing (mixing) modules in C++ and Python. Modules written in different languages will still see the same data, and they can also exchange data within the framework.
- Support for both XTC and HDF5 files.
- Ability to read the "live" files while they're being recorded by the DAQ system or data movers.
- Ability to read data from shared memory on the monitoring machines.
The last two features open the powerful possibility of using psana for real-time monitoring of data while reusing the same code which might be developed for the traditional OFFLINE processing/analysis.
2. Getting Started
Example Location
All examples can be found under the following directory:
Code Block |
---|
/reg/g/psdm/tutorials
|
Data Files
In order to make our examples as close to the "real" analysis environment as possible we chose to create 6 pseudo experiments (one per instrument):
...
Each experiment's directory has the standard structure:
Code Block |
---|
ls -al /reg/d/psdm/XPP/xpptut13/
drwxr-sr-x 8 psdatmgr ps-data 4096 Jun 4 12:02 .
drwxrwsr-x 30 psdatmgr ps-data 4096 Jun 4 12:02 ..
drwxrwsr-x+ 3 psdatmgr ps-data 4096 Jun 5 10:53 calib
drwxrwsr-x+ 2 psdatmgr ps-data 4096 Jun 4 12:02 ftc
drwxr-sr-x 2 psdatmgr ps-data 4096 Jun 6 16:41 hdf5
drwxrwsr-x+ 2 psdatmgr ps-data 4096 Jun 4 12:02 res
drwxrwsr-x+ 2 psdatmgr ps-data 4096 Jun 4 12:03 scratch
drwxr-sr-x 2 psdatmgr ps-data 4096 Jun 6 16:33 xtc
ls -al /reg/d/psdm/XPP/xpptut13/hdf5/
drwxr-sr-x 2 psdatmgr ps-data 4096 Jun 6 16:41 .
drwxr-sr-x 8 psdatmgr ps-data 4096 Jun 4 12:02 ..
-r--r--r-- 1 psdatmgr ps-data 1765036151 Jun 6 16:39 xpptut13-r0178.h5
-r--r--r-- 1 psdatmgr ps-data 394528165 Jun 6 16:39 xpptut13-r0179.h5
-r--r--r-- 1 psdatmgr ps-data 128185688 Jun 6 16:39 xpptut13-r0180.h5
-r--r--r-- 1 psdatmgr ps-data 912811050 Jun 6 16:39 xpptut13-r0181.h5
|
All data files are open for reading by anyone who can log onto PCDS LCLS computers. Moreover, those directories (like scratch/, ftc/) are open for writing by anyone. And yes, one can also see these experiments in the Web Portal.
Setting up the Environment
- Make sure you can run X11 applications. Most examples in this tutorial do a simple visualization. You can pass the -X or -Y argument to ssh to forward the screens to your local machine. Log onto any machine of the interactive analysis pools psananeh or psanafeh, e.g.:
Code Block |
---|
ssh -Y psana
|
- Make sure you source (just once) one of the following scripts, depending on which UNIX shell you are using. When using the bash shell:
Code Block |
---|
. /reg/g/psdm/etc/sitana_env.sh
|
Or, when using the csh shell family:
Code Block |
---|
source /reg/g/psdm/etc/sitana_env.csh
|
Note that the default shell for most LCLS users is bash.
- Run (just once) the following command, which will set up a proper OFFLINE Analysis environment for the latest analysis release:
Code Block |
---|
sit_setup ana-current
|
This gives you access to the latest psana software release (there are other commands to use an older or a local release).
At this point you should be ready to go. To test that your environment is set up correctly, try running psana without any parameters. If your environment is properly set you should see something like this:
Code Block |
---|
psana
[error:2013-06-06 20:54:44.131:PSAnaApp.cpp:218] no analysis modules specified
|
3. Basic Examples
This section presents a few simple scripts which have been developed to underline the main ideas behind the framework's API. The code of the examples along with a simple HOWTO file can be found at:
Code Block |
---|
/reg/g/psdm/tutorials/common/data_access_methods/
|
Printing Identifiers for all Events in a Run
First try this:
Code Block |
---|
./print_event_id.py
|
Then look at the code. It will do three things:
- Import the psana module:
Code Block |
---|
import psana
|
- Open the data set. Note the syntax for the data set specification string:
Code Block |
---|
dsname = "exp=sxrtut13:run=366"
ds = psana.DataSource(dsname)
|
Note that by default the framework will look for XTC files at the standard location where all experimental data are supposed to be. If you want to play with HDF5 (in case there is an HDF5 version of the run) you can change that string by appending h5 at the end:
Code Block |
---|
dsname = "exp=sxrtut13:run=366:h5"
|
- Iterate over all events. At each step the code gets a reference to an event object evt, then extracts and prints the identifier of that event:
Code Block |
---|
for evtnum, evt in enumerate(ds.events()):
    id = evt.get(psana.EventId)
    print "%6d:" % evtnum, id
|
In the end you're supposed to see something like this:
Code Block |
---|
./print_event_id.py
     0: XtcEventId(run=366, time=2013-04-21 04:37:39.343773772-07, fiducials=38877, ticks=329342, vector=19553)
     1: XtcEventId(run=366, time=2013-04-21 04:37:39.360457259-07, fiducials=38883, ticks=331442, vector=19554)
     2: XtcEventId(run=366, time=2013-04-21 04:37:39.377123777-07, fiducials=38889, ticks=330560, vector=19555)
     3: XtcEventId(run=366, time=2013-04-21 04:37:39.393797466-07, fiducials=38895, ticks=329762, vector=19556)
     4: XtcEventId(run=366, time=2013-04-21 04:37:39.410477971-07, fiducials=38901, ticks=331204, vector=19557)
     5: XtcEventId(run=366, time=2013-04-21 04:37:39.427145705-07, fiducials=38907, ticks=331036, vector=19558)
     6: XtcEventId(run=366, time=2013-04-21 04:37:39.443816588-07, fiducials=38913, ticks=329370, vector=19559)
     7: XtcEventId(run=366, time=2013-04-21 04:37:39.460499778-07, fiducials=38919, ticks=331414, vector=19560)
     8: XtcEventId(run=366, time=2013-04-21 04:37:39.477167658-07, fiducials=38925, ticks=330616, vector=19561)
     9: XtcEventId(run=366, time=2013-04-21 04:37:39.493840079-07, fiducials=38931, ticks=329720, vector=19562)
    10: XtcEventId(run=366, time=2013-04-21 04:37:39.510520348-07, fiducials=38937, ticks=331218, vector=19563)
|
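The data set specification string is just colon-separated fields, so it can be assembled programmatically when looping over several runs. The following is a minimal sketch; the helper name `make_dsname` and its parameters are hypothetical and not part of psana. It only builds the two string forms shown in this section.

```python
def make_dsname(experiment, run, hdf5=False):
    """Build a data set specification string such as 'exp=sxrtut13:run=366'.

    Hypothetical helper for illustration only; it is not part of psana.
    """
    parts = ["exp=%s" % experiment, "run=%d" % run]
    if hdf5:
        # Appending 'h5' selects the HDF5 version of the run instead of XTC.
        parts.append("h5")
    return ":".join(parts)


print(make_dsname("sxrtut13", 366))
print(make_dsname("sxrtut13", 366, hdf5=True))
```

The resulting string would then be passed to the data source constructor in the same way as the hand-written `dsname` in the example script.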
Printing a Catalog of Event Components
The example can be run like this:
Code Block |
---|
./print_event_keys.py
Components of the first event found in the dataset:
EventKey(type=psana.EvrData.DataV3, src='DetInfo(NoDetector.0:Evr.0)')
EventKey(type=psana.Acqiris.DataDescV1, src='DetInfo(SxrEndstation.0:Acqiris.0)')
EventKey(type=psana.Acqiris.DataDescV1, src='DetInfo(SxrEndstation.0:Acqiris.1)')
EventKey(type=psana.Bld.BldDataEBeamV3, src='BldInfo(EBeam)')
EventKey(type=psana.Bld.BldDataPhaseCavity, src='BldInfo(PhaseCavity)')
EventKey(type=psana.Bld.BldDataFEEGasDetEnergy, src='BldInfo(FEEGasDetEnergy)')
EventKey(type=psana.Bld.BldDataGMDV1, src='BldInfo(GMD)')
EventKey(type=psana.EventId)
EventKey(type=None)
|
Why is it so important to know this information? Because these parameters will tell you:
- What's inside the event
- How to extract the data objects associated with these keys
The type and the source in the output shown above allow you to access the data you need by using the following get functions:
Code Block |
---|
obj = evt.get( psana.EvrData.DataV3, psana.Source('DetInfo(NoDetector.0:Evr.0)'))
obj = evt.get( psana.Acqiris.DataDescV1, psana.Source('DetInfo(SxrEndstation.0:Acqiris.0)'))
obj = evt.get( psana.Acqiris.DataDescV1, psana.Source('DetInfo(SxrEndstation.0:Acqiris.1)'))
obj = evt.get( psana.Bld.BldDataEBeamV3, psana.Source('BldInfo(EBeam)'))
obj = evt.get( psana.Bld.BldDataPhaseCavity, psana.Source('BldInfo(PhaseCavity)'))
obj = evt.get( psana.Bld.BldDataFEEGasDetEnergy, psana.Source('BldInfo(FEEGasDetEnergy)'))
obj = evt.get( psana.Bld.BldDataGMDV1, psana.Source('BldInfo(GMD)'))
obj = evt.get( psana.EventId)
|
You already encountered one of these get functions in the very first example, where you were extracting the event identifier. The type field indicates what kind of data you are accessing; the source field indicates the instance of that particular detector. In this example there are two Acqiris digitizers with the same data type. Note that event components obtained through this API will be objects of various classes. A full catalog of those objects can be found in the DOXYGEN documentation, which is auto-generated from the code of the OFFLINE releases.
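The mapping from a printed event key to its get call is mechanical, which the sketch below tries to show. It parses lines in the format printed by print_event_keys.py and produces the text of the matching getter. The names `KEY_PATTERN` and `getter_for` are hypothetical, and the regular expression assumes keys are printed exactly as in the listing above.

```python
import re

# Matches lines such as:
#   EventKey(type=psana.Bld.BldDataGMDV1, src='BldInfo(GMD)')
#   EventKey(type=psana.EventId)
KEY_PATTERN = re.compile(
    r"EventKey\(type=(?P<type>[\w.]+)(?:, src='(?P<src>[^']*)')?\)"
)


def getter_for(line):
    """Return the text of the evt.get(...) call for one printed event key.

    Returns None for lines that are not event keys or that carry no type.
    """
    match = KEY_PATTERN.search(line)
    if match is None or match.group("type") == "None":
        return None
    if match.group("src"):
        # Keys with a source need both the type and a psana.Source argument.
        return "evt.get(%s, psana.Source('%s'))" % (
            match.group("type"), match.group("src"))
    # Keys without a source (for example the event identifier) need only the type.
    return "evt.get(%s)" % match.group("type")


print(getter_for("EventKey(type=psana.Bld.BldDataGMDV1, src='BldInfo(GMD)')"))
print(getter_for("EventKey(type=psana.EventId)"))
```

This only generates the call as text; it is meant as a reading aid for the key listing, not as a replacement for writing the getters by hand.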
Iterating over steps and events
Some of our experiments (in particular XPP) rely heavily on so-called steps (also known as "Calibration Transitions") while taking their data. Each DAQ run has one or many such steps. Events are recorded in the scope of a particular step. The new framework has a special provision for steps through the iterator of steps. The idea behind the following example is:
- Open a data set which has multiple steps in each run
- Iterate over steps
- Iterate over events in each step
This simple application knows about step boundaries. Moreover, this example illustrates how to open a data set composed of many runs (processing a series of runs at once). There are two examples in this set:
Code Block |
---|
./steps_in_runs_xtc.py
./steps_in_runs_hdf5.py
|
These two scripts perform the same processing; the only difference is which data format they access. The first example reads XTC files, while the second one reads HDF5 files. When running these examples you should notice differences in their performance. These are explained by the different organization of data in the XTC and HDF5 formats. We'll be happy to provide you with an explanation if you're interested.
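The nested iteration described above can be sketched as follows. This is not the code of the tutorial scripts; it assumes that the data source object offers a `steps()` iterator and that each step offers an `events()` iterator, and the function name `count_events_per_step` is hypothetical.

```python
def count_events_per_step(ds):
    """Return a list holding the number of events found in each step.

    `ds` is assumed to provide steps(), and each step is assumed to
    provide events(); both are assumptions made for this sketch.
    """
    counts = []
    for step in ds.steps():          # outer loop: one iteration per step
        nevents = 0
        for evt in step.events():    # inner loop: events recorded in this step
            nevents += 1
        counts.append(nevents)
    return counts
```

With a working environment, the function would be called on a data source opened as in the earlier examples, and the returned list makes the step boundaries visible as one count per step.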
4. Instrument-specific examples
This section includes a number of examples which are relevant to different instruments. Their primary goal is to illustrate how to access data objects specific to each instrument.
XCS: Princeton movie
The code for these examples is found at:
Code Block |
---|
/reg/g/psdm/tutorials/xcs/princeton_movie/
|
SXR: correlation plots for signals from GMD and Diode
The code for these examples is found at:
Code Block |
---|
/reg/g/psdm/tutorials/sxr/gmd_vs_diode/
|
CXI: diffraction patterns on the CSPad detector
The code of examples is found at:
Code Block |
---|
/reg/g/psdm/tutorials/cxi/cspad_imaging/
|
There are three examples in this directory. They demonstrate three progressively more complex steps for extracting and pre-processing images taken with the CSPad detector. These examples also introduce the ability to provide a configuration file describing the parameters of the analysis. These configuration files follow the usual psana syntax.
- dump_2x1_elements: the first example illustrates how individual 2x1 structures can be located from an event and displayed as-is, without any processing. This example doesn't use any psana modules.
- frame_reco: the second example adds one of the standard psana modules in order to reconstruct a full CSPad frame from the corresponding 2x1 components. This example is based on the interactive psana's ability to run events through an optional chain of modules. The modules are specified and configured via an external configuration file, frame_reco.cfg.
- frame_reco_calib: the last example adds one more module to calibrate (pedestal subtraction and gain correction) the reconstructed CSPad images. The modules are configured in an external configuration file, frame_reco_calib.cfg.
Please run these examples in the order described above:
Code Block |
---|
./dump_2x1_elements.py
./frame_reco.py
./frame_reco_calib.py
|
5. Doing something less trivial
Custom HDF5 translator written by a user
A problem:
- Let's suppose we need to write a data extraction tool to extract images (CSPad, Princeton, etc.) from XTC files and make them available for further analysis in Matlab. At this point we should already know how to get images from the raw files using ipsana. Now the only remaining problem is to store them in some form which is readable from Matlab (assuming we're looking for reasonable performance).
Perhaps the best way to solve the problem would be to store those images in an HDF5 file using some library, and this is what this example offers. It uses the PyTables package to dump numpy arrays into an output file. This package is known for its simple API, which doesn't require a user to learn the lower-level h5py library.
The code of the example is found at:
...
will notice that the HDF5 version is significantly faster. This is due to the fact that we don't yet support indexing for XTC files.
Extracting values of EPICS variables
In this example we will demonstrate how to access a value of an EPICS variable. The application will monitor changes in the value and report event numbers at which changes happen:
Code Block |
---|
./scan_epics.py
|
Here are the relevant lines of code:
Code Block |
---|
epics = ds.env().epicsStore()
pv = epics.getPV('VGCP:FEE1:311:P').data()[0]
|
For many PVs there is only one element in the data() array. To access the first element, one can also do:
Code Block |
---|
pv = epics.value('VGP:FEE1:311:P')
pv = epics.value('VGP:FEE1:311:P',0)
pv = epics.getPV('VGP:FEE1:311:P').value(0)
|
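The change-monitoring logic itself is independent of EPICS and can be sketched in a few lines. The generator below is an illustration under assumptions, not the code of scan_epics.py: it takes any sequence of per-event readings (in a real script these would be the PV values read once per event) and reports the event numbers at which the value changes. The name `find_changes` is hypothetical.

```python
def find_changes(values):
    """Yield (event_number, value) whenever a reading differs from the previous one.

    The first reading is always reported, since there is nothing to compare it with.
    """
    previous = object()  # sentinel that never compares equal to a real reading
    for evtnum, value in enumerate(values):
        if value != previous:
            yield evtnum, value
            previous = value


# Readings from five consecutive events; the value changes at events 2 and 4.
for evtnum, value in find_changes([1.0, 1.0, 2.5, 2.5, 1.0]):
    print("%6d: %s" % (evtnum, value))
```

Comparing with `!=` treats any difference as a change; for noisy floating-point PVs a tolerance-based comparison may be a better fit.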
6. Wrapping up
Where to look for support
- send e-mail to: pcds-help@slac.stanford.edu
Documentation
- DOXYGEN documentation for psana
- Batch version of psana
- psana - Module Catalog - comprehensive catalog of psana modules in the latest Analysis Release
- psana - Module Examples