Infrastructure
...
How do I use the LCLS batch farm?
Follow the instructions here: Submitting Batch Jobs
Problems accessing data, or data seems to have disappeared
Two things to check:
- Have you been given access to view the data?
- Has the data been removed due to the data retention policy?
For the first, new users to an experiment need to ask the experiment POC to add them to the experiment. After this is done, you must log out and log back in for the change to take effect.
For the second, when analysis code that used to work stops working, check to see if the xtc is visible. For example, if you are analyzing run 68 of xpptut13, take a look in the xtc directory for that experiment, i.e:
ls /reg/d/psdm/xpp/xpptut13/xtc
If the directory is visible but run 68 is not there, it may be that the data was removed due to the Data Retention Policy Version 2. The data is still available on disk and can be restored using the Web Portal of Experiments.
If the xtc directory is not visible, make sure you are running on a node that can see the data (i.e., you are on a psana node, rather than a psdev or pslogin node). If it is still not visible, email pcds-help@slac.stanford.edu.
How do I keep programs running if an ssh connection fails
See if you can use the LSF batch nodes for your work. If not, see if the NoMachine remote desktop technology will work. Otherwise, three unix programs that help with this are tmux, nohup and screen. None of these programs will preserve a graphical program or an X11 connection, so run your programs in terminal mode.
tmux
For example, with tmux, if one does
ssh psexport
ssh psana
# suppose we have landed on psanacs040 and that there is a matlab license here
tmux
matlab --nosplash --nodesktop
If you lose the connection to psanacs040, you can go back to that node and reattach:
ssh psexport
ssh psanacs040
tmux attach
You need to remember the node you ran tmux on. If you are running matlab, you can run the matlab license script with the --show-users parameter to see where you are running it:
/reg/common/package/scripts/matlic --show-users
nohup
You could run a batch process with nohup (no hangup) as follows:
nohup myprogram
For example, suppose we want to run a Python script that prints to the screen and save its output (the below syntax is for the bash shell):
nohup python myscript.py > myoutput 2>&1 &
Here we are capturing the output of the program in myoutput, along with anything it writes to stderr (the 2>&1), then putting it in the background. The job will persist after you log out. You can take a look at the output in the file myoutput the next day. As with tmux, you will need to remember the node you launched nohup on.
Why did my batch job fail? I'm getting 'command not found'
Before running your script, make sure you can run something, for instance do
bsub -q psnehq pwd
(substitute the appropriate queue for psnehq). If you created a script and are running
bsub -q psnehq myscript
then it may be that the current directory is not in your PATH; run
bsub -q psnehq ./myscript
Also check that myscript is executable by you and that it has the correct #! line to start the script.
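A quick local check before submitting, as a minimal sketch (myscript is the placeholder name from above; the bsub line is shown for illustration only):

```shell
# create a trivial script with an interpreter line and make it executable
printf '#!/bin/bash\npwd\n' > myscript
chmod u+x myscript
# verify the #! line and that it runs locally before handing it to LSF
head -1 myscript
./myscript
# once this works locally, submit with an explicit path:
#   bsub -q psnehq ./myscript
```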
sit_setup fails in script using ssh
Users have run into issues in the following scenario
- writing a script that uses ssh to get to another node
- uses sit_setup to set up the environment for that node (in the script)
The issue is that sit_setup is an alias (defined by /reg/g/psdm/etc/ana_env.sh). In bash, aliases are not expanded by default in non-interactive shells, so you have two options:
- '. /reg/g/psdm/bin/sit_setup.sh' instead of sit_setup
- change the shell behavior by saying 'shopt -s expand_aliases'
Typically the first option works best.
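The alias behavior is easy to demonstrate locally; this is a minimal sketch in which the 'hi' alias stands in for sit_setup (a script piped to bash runs non-interactively, just like a command run over ssh):

```shell
# without expand_aliases the alias is defined but never expanded,
# so 'hi' fails with "command not found"
printf 'alias hi="echo hello"\nhi\n' | bash 2>/dev/null \
    || echo "alias not expanded"
# with expand_aliases set on an earlier line, the alias works
printf 'shopt -s expand_aliases\nalias hi="echo hello"\nhi\n' | bash
```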
Psana
Topics specific to Psana
Where is my epics variable?
Make sure it is an epics variable - it may be a control monitor variable. An easy way to see what is in the file is to use psana modules that dump data. For instance:
psana -m psana_examples.DumpEpics exp=cxitut13:run=0022
will show what epics variables are defined. Likewise
psana -m psana_examples.DumpControl exp=xpptut13:run=0179
will almost always show what control variables are defined. It defaults to using the standard Source "ProcInfo()" for control data. It is possible (though very unlikely) for control data to come from a different source. One can use the EventKeys module to see all Sources present, and then specify the source for DumpControl through a config file.
How do I access data inside a Psana class?
Look for an example in the psana_examples package that dumps this class. There should be both a C++ and Python module that dumps data for the class.
How do I find out the experiment number (expNum) or experiment?
Psana stores both the experiment and expNum in the environment object - the Env that Modules are passed, or that one obtains from the DataSource in interactive Psana. See the Interactive Analysis document and the C++ reference for Psana::Env.
Why isn't there anything in the Psana Event?
This may be due to a problem in the DAQ software that was used during the experiment. The DAQ software may have incorrectly been setting the L3 trim flag. This flag is supposed to be set for events that should not be processed (perhaps they did not meet a scientific criterion involving beam energy). When the flag is set, there should be very little in the xtc datagram - save perhaps epics updates. Psana (as of release ana-0.10.2 from October 2013) will by default not deliver these events to the user. The bug is that the flag was set when there was valid data. To force psana to look inside datagrams where L3T was set, set the option l3t-accept-only to 0. To use this option from the command line do:
psana -o psana.l3t-accept-only=0 ...
Or you can add the option to your psana configuration file (if you are using one):
[psana]
l3t-accept-only=0
It seems that for as much as 5% of the time, CsPad DataV2 is not in the Event
The only distinction between CsPad DataV1 and DataV2 is the sparsification of particular sections as given in the configuration object. That is, DataV2 may be sparser. The actual hardware is kicking out DataV1, but the DAQ event builder makes a DataV2 when it can. Sometimes the DAQ sends the original DataV1 instead of the DataV2. This can be due to limited resources, in particular competition with resources required for compressing cspad in the xtc files. If you do not find a DataV2 in the Event, look for a DataV1 instead.
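In the Python interface the fallback can be written as a try-V2-then-V1 lookup. A minimal sketch follows; the stand-in classes and the dict-backed event replace psana.CsPad.DataV2/DataV1 and the real psana Event, which are only available inside the analysis environment:

```python
# stand-ins for psana.CsPad.DataV2 / DataV1 (illustrative only)
class DataV1(object):
    pass

class DataV2(object):
    pass

def get_cspad(event):
    """Prefer DataV2, but fall back to DataV1 when the DAQ sent the original."""
    data = event.get(DataV2)
    if data is None:
        data = event.get(DataV1)
    return data

# an event where only the original DataV1 made it into the xtc
evt = {DataV1: DataV1()}
print(type(get_cspad(evt)).__name__)  # -> DataV1
```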
How do I set psana verbosity from the config file?
Almost all psana options can be set from either the config file or the command line. Unfortunately verbosity cannot be set from a config file. That is,
psana -v -v ...
has no config file equivalent. This is because verbosity is a function of the Message service, of which psana is a client rather than the server. Unfortunately this makes it difficult to turn on debugging messages from within a python script that is run as
python myscript.py
However one can configure the message logging service through the MSGLOGCONFIG environment variable. In particular
MSGLOGCONFIG=trace python myscript.py
or
MSGLOGCONFIG=debug python myscript.py
turns on trace or debug level MsgLog messages. The above examples were tested with the bash shell. For full details on configuring the MsgLogger through the MSGLOGCONFIG environment variables, see https://pswww.slac.stanford.edu/swdoc/releases/ana-current/doxy-all/html/group__MsgLogger.html
Strange Results after doing Math with data in Python
One thing to bear in mind with the Python interface: detector data is almost always returned as a numpy array of an integral type. For example, after getting a waveform of Acqiris data, if you were to print it out during an interactive Python session, you might see
waveform
array([234,53,5,...,324], dtype=int16)
Note the dtype: this data is two-byte signed integers. In particular, if a value of the waveform is over 32000 and you add 10000 to it, the result will wrap around and become negative. It is a good idea, before doing any math with the data, to convert it to floats:
waveform = waveform.astype(np.float)
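The wraparound is easy to demonstrate with numpy (the waveform values below are made up for illustration):

```python
import numpy as np

# two-byte signed integers, as Acqiris waveforms typically arrive
waveform = np.array([32000, 100, -5], dtype=np.int16)

# adding 10000 overflows int16 (max 32767) and wraps to a negative value
wrapped = waveform + np.int16(10000)
print(wrapped[0])   # -> -23536

# converting to float first avoids the overflow
safe = waveform.astype(np.float64) + 10000.0
print(safe[0])      # -> 42000.0
```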
Specialized Psana Options, liveTimeOut, firstControlStream, etc
There are a number of specialized options in psana that we do not generally document in the main user documentation. These are options that engineers, points of contact (POCs), or possibly users may need in special circumstances. They are described in the table below:
package.module.option or psana.option | data type | option description
---|---|---
PSXtcInput.XtcInputModule.liveTimeout | integer | live timeout in seconds when running in live mode (defaults to 120)
PSXtcInput.XtcInputModule.runLiveTimeout | integer | starting in ana-0.13.17, live timeout when waiting for a new run (defaults to 0)
psana.first_control_stream | integer | which stream starts the control or IOC streams (xtcav, etc); defaults to 80
psana.l3t-accept-only | bool | when true, checks for trimmed datagrams. If the datagram is trimmed, it does not trigger an event for the user. Trimmed datagrams should be empty. There was a case where trimming was not set up properly, and we needed to turn this off to see the data. Note - this is different than the l3t pass/fail that POCs may set up - though one does not expect trimmed data unless the l3t is a fail.
psana.allow-corrupt-epics | bool | for data from before approximately July 2014, when the DAQ epics archiver recorded different variables at different rates, the index files do not contain enough information to properly access the correct epics values. When this happens, and the user is running in index mode, psana will exit with a fatal error. Setting this flag to True will allow psana to analyze these runs (in index mode), but some of the epics data could be corrupted.
Hdf5
Topics specific to hdf5
Why is there both CsPad DataV2 and CsPad DataV1 in the translation?
The only distinction between CsPad DataV1 and DataV2 is the sparsification of particular sections as given in the configuration object. That is, DataV2 may be sparser. The actual hardware is kicking out DataV1, but the DAQ event builder makes a DataV2 when it can. Sometimes the DAQ sends the original DataV1 instead of the DataV2. This can be due to limited resources, in particular competition with resources required for compressing cspad in the xtc files.
How do I write hdf5 files from C++ or Python
Python:
From Python we recommend h5py. For interactive Python, an example is found at Using h5py to Save Data.
For Python you can also use pytables. This is installed in the analysis release. Do
import tables
In your Python code.
C++
If developing a Psana module to process xtc, consider splitting your module into a C++ module which puts ndarrays in the event store, and a Python module which retrieves them and writes the hdf5 file using h5py.
You can also work with the C interface to hdf5. hdf5 is installed as a package in the analysis release. From your C++ code, do
#include "hdf5/hdf5.h"
A tip for learning hdf5 is to run example programs from an 'app' subdirectory of your package. For example, if you create an analysis release and a package for yourself, create an app subdirectory to that package and put an example file there:
~/myrelease/mypackage/app/hdf5_example.c
Now run 'scons' from the ~/myrelease directory, and then run hdf5_example.
Psana Modules - Using the Translator:
The Psana ddl based Translator can be used to write ndarrays, strings and a few simple types that C++ modules register. These will be organized in the same groups that we use to translate xtc to hdf5. Datasets with event times will be written as well. To use this, create a psana config file that turns off the translation of all xtc types but allows translation of ndarrays and strings. An example cfg file is here: psana_translate_noxtc.cfg You would just change the modules and files parameters for psana and the output_file parameter to Translator.H5Output. Load modules before the translator that put ndarrays into the event store. The Translator will pick them up and write them to the hdf5 file
Chunks, compression, and why is my random access algorithm so slow?
By default, hdf5 files are translated in compressed chunks. The compression (standard gzip with deflate level 1; the valid range is [0,9]) reduces file size by about 50%. The chunk size varies with the data type. The chunk policy focuses on not having too many chunks, as we believe this degrades performance. Typically we shoot for chunks of 16MB; however, for full cspad this is only 4 objects per chunk, so we default to 22 objects per chunk, or 100MB chunks. This is fine for a program that linearly reads through an hdf5 file, or a parallel program that divides the file into sections (start, middle, end), but it is not optimal for a random access analysis. If you read one cspad, you read the other 21 in its chunk and decompress the whole 100MB chunk.
In terms of how this plays with the filesystem cache and the hdf5 cache: the filesystem will cache the original data in the hdf5 file, so if you revisit a cspad from the same chunk, the compressed data may be cached by the filesystem, but hdf5 will have to decompress the whole chunk again. hdf5 also maintains its own chunk cache which stores recently used chunks (uncompressed). There is one global cache, but one can also create caches on a per-dataset basis. The Translator creates datasets with a cache to hold 3 chunks - so a 300MB cache for a cspad dataset. This information is available to any program that reads the hdf5 file, but whether or not it is used is unclear. If you are writing your own program to read an hdf5 file, you can set the overall chunk cache, as well as per-dataset caches, explicitly.
There are options with translation to control the objects per chunk. One can also turn off compression when translating.
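If you read the files with h5py (version 2.9 or later), the per-file chunk cache can be enlarged when opening the file. A minimal sketch, with a made-up file name and a toy dataset standing in for a translated cspad:

```python
import numpy as np
import h5py

# write a small chunked, gzip-compressed dataset, mimicking the
# Translator's defaults (chunked storage, deflate level 1)
with h5py.File('chunk_demo.h5', 'w') as f:
    f.create_dataset('data', data=np.arange(1000), chunks=(100,),
                     compression='gzip', compression_opts=1)

# reopen with a 300MB raw-data chunk cache (enough for three 100MB cspad
# chunks); rdcc_nslots should be a prime well above the chunks cached
with h5py.File('chunk_demo.h5', 'r',
               rdcc_nbytes=300 * 1024 ** 2, rdcc_nslots=100003) as f:
    print(f['data'][42])   # -> 42
```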
TimeTool
The TimeTool results can be obtained in one of two ways depending on the experimental setup. The first is directly during data acquisition, the second is during offline analysis using the psana module TimeTool.Analyze. Regarding data acquisition: for data recorded prior to Oct 13, 2014, the timetool results were always recorded as EPICS PVs. After Oct 13, 2014, they are recorded in their own data type: TimeTool::DataV*. The version increments as the time tool develops. For data recorded up to around November 2014, this was DataV1; around December 2014 it changed to DataV2. Regarding offline analysis, the TimeTool.Analyze psana module likewise puts a TimeTool::DataV* object in the event store, depending on the version of the software. Please see the documentation on the TimeTool in the Psana Module Catalog - TimeTool Package.
An error occurred during authentication
This is a common error when a user has not obtained an AFS token. You can obtain a token with:
kinit
SXR case study: reference/signal in different runs, plots to check
First, the driver script:
#!/usr/bin/env python
__doc__='''
Below we go over a script for generating offline TimeTool data for an
sxr experiment. For this experiment, run 144 was done with the beam blocked,
and run 150 with the beam on. The laser is always on in both runs.
This requires some configuration of the TimeTool. We run the TimeTool.Analyze
module on run 144 - telling it that the beam is always off. Analyze builds up a
reference. We further configure Analyze to save the reference. We also save an
averaged background image for our own interactive plotting later.
We then run TimeTool.Analyze on run 150. We save the resulting timetool values
to an hdf5 file using the psana xtc to hdf5 translator.
Finally, to check the work, we process run 150 in index mode. We load the
time tool values from the h5 file and plot them against the opal, after subtracting
our background image.
Here are instructions for setting up to run on this sxr experiment.
Users of other experiments will not be able to run on this data.
To get setup, after setting up the analysis software environment, do
newrel ana-current myrel
cd myrel
sit_setup
addpkg TimeTool HEAD # to get the put_ndarrays option, not needed if
# ana-current is > 0.13.0
cp *thisFile* ttanalyze.py # see confluence Data Analysis FAQ for this file
newpkg mypkg
mkdir mypkg/src
cp *ttlib.py* mypkg/src/ttlib.py # see confluence Data Analysis FAQ for this file
scons
chmod u+x ttanalyze.py
Now run
./ttanalyze.py
this should produce the files
ttref_sxrd5814_r0144.txt # reference that TimeTool.Analyze produced
ttref_sxrd5814_r0144.npy # our own background reference for plotting
tt_sxrd5814_r0150.h5 # the h5 file with the timetool results for r150
This script builds a background from the first 300 events in run 144,
and processes the first 500 events in run 150.
A recursive listing of the h5 file should show, among other groups:
h5ls -r tt_sxrd5814_r0150.h5
/Configure:0000/Run:0000/CalibCycle:0000/ndarray_float64_1/noSrc__TTANA:AMPL/data Dataset {500/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/ndarray_float64_1/noSrc__TTANA:AMPL/time Dataset {500/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/ndarray_float64_1/noSrc__TTANA:AMPLNXT/data Dataset {500/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/ndarray_float64_1/noSrc__TTANA:AMPLNXT/time Dataset {500/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/ndarray_float64_1/noSrc__TTANA:FLTPOS/data Dataset {500/Inf}
/Configure:0000/Run:0000/CalibCycle:0000/ndarray_float64_1/noSrc__TTANA:FLTPOS/time Dataset {500/Inf}
The data datasets are the TimeTool values. FLTPOS is the main result.
The time datasets store the event id's for the data - the seconds, nanoseconds and fiducials.
After running the script once, the second time the script is run, it will produce
interactive plots.
comments below continue to annotate the script.
'''
import os
import sys
import mypkg.ttlib as ttlib
experiment = 'sxrd5814'
refRun = 144
refNumEvents = 300 # set to None to process the whole run
analRun = 150
analNumEvents = 500 # set to None to process the whole Run
# for this experiment, the xtcav data was not correct. We are excluding
# it by setting stream=0-12. Including the xtcav data (which is in stream 80)
# will generate some warnings but otherwise should pose no problem.
# It is cleaner to not include xtcav/stream 80.
refDataSource = 'exp=%s:run=%4.4d:stream=0-12' % (experiment, refRun)
analDataSource = 'exp=%s:run=%4.4d:stream=0-12' % (experiment, analRun)
# We specify the index mode for the interactive plotting so that we
# can jump around from event to event. Note the :idx below
interDataSource = 'exp=%s:run=%4.4d:stream=0-12:idx' % (experiment, analRun)
# output file for TimeTool.Analyze during its first run when we
# save the reference
timeToolRefFile = 'ttref_%s_r%4.4d.txt' % (experiment, refRun)
# save the background for our own work
numpyRefFile = 'ttref_%s_r%4.4d.npy' % (experiment, refRun)
# The script does three things:
#
# 1. build reference -> outputs two files
# 2. analyze/save timetool values -> outputs one file
# 3. plot
#
# If none of the files are present, it will do 1 & 2 and then ask you to
# rerun to do step 3.
# If files for 1 are present, it will skip step 1. If files for 2 are present
# it will skip step 2. To force a redo even if files are present, set
redo = False # redo to True
# specify source alias for opal. One can check what sources are present in the data
# by using EventKeys, i.e.,
# psana -n 1 -m EventKeys exp=sxrd5814:run=144
opal_src = 'TSS_OPAL'
# a prefix for the timetool data, can be anything, but must be the same between
# steps
put_key = 'TTANA'
# name of h5 file to save timeTool values
ttResultFile = 'tt_%s_r%4.4d.h5' % (experiment, analRun)
overwrite = True # by default overwrite timetool results that we write in the h5 file.
interactive = True # set to false to not analyze interactively.
# this is done after analyze and the time tool results are written out
plot_offset_x = 26.0 # the fltpos seems to track a little behind the line; add this to the
# x position to visually see better tracking
# this is from the sxr_timetool.cfg file.
# weights to turn the projection from a step function into a signal with a peak. Run across the
# signal.
sxr_timetool_weights = '0.00940119 -0.00359135 -0.01681714 -0.03046231 -0.04553042 -0.06090473 -0.07645332 -0.09188818 -0.10765874 -0.1158105 -0.10755824 -0.09916765 -0.09032289 -0.08058788 -0.0705904 -0.06022352 -0.05040479 -0.04144206 -0.03426838 -0.02688114 -0.0215419 -0.01685951 -0.01215143 -0.00853327 -0.00563934 -0.00109415 0.00262359 0.00584445 0.00910484 0.01416929 0.0184887 0.02284319 0.02976289 0.03677404 0.04431778 0.05415214 0.06436626 0.07429347 0.08364909 0.09269116 0.10163601 0.10940983 0.10899065 0.10079016 0.08416471 0.06855799 0.05286105 0.03735241 0.02294275 0.00853613'
didRefOrAnalyze = False
# all the work will be in functions in ttlib:
if ( (not os.path.exists(timeToolRefFile)) or
(not os.path.exists(numpyRefFile)) or
redo):
ttlib.makeReferenceFiles(refDataSource,
timeToolRefFile,
numpyRefFile,
refNumEvents,
opal_src,
put_key,
sxr_timetool_weights)
didRefOrAnalyze = True
if ( (not os.path.exists(ttResultFile)) or
redo):
ttlib.analyze(analDataSource,
overwrite,
timeToolRefFile,
ttResultFile,
analNumEvents,
opal_src,
put_key,
sxr_timetool_weights)
didRefOrAnalyze = True
if interactive:
if didRefOrAnalyze:
print "Rerun script for interactive plots"
sys.exit(0)
ttlib.interactivePlots(interDataSource,
ttResultFile,
opal_src,
put_key,
sxr_timetool_weights,
plot_offset_x,
numpyRefFile)
Second, library code for a package in the release:
import os
import sys
import numpy as np
import psana
import h5py
import matplotlib.pyplot as plt
import IPython
# first we define some options for TimeTool that we want to use between
# steps. These are from TimeTool/data/sxr_timetool.cfg. Typically options are
# defined in a psana config file such as sxr_timetool.cfg. However one can also
# define the options through psana's python interface as we are doing. Note -
# this configuration applies to all datasources that are created.
# The ROI where the TimeTool does its analysis.
# This will be plotted in a white box. Users can change this if it looks
# like the signal to measure is somewhere else on the opal.
sig_roi_x = [0, 1023]
sig_roi_y = [425, 724]
sb_avg_fraction = 0.05 # Rolling average convergence factor (1/Nevents)
# The reference that TimeTool.Analyze uses is built up from all the
# previous shots using a rolling average. The default is for the TimeTool
# to use the last no-beam shot for the reference weighted at 1.0 (meaning
# it won't use the previous shots). For this behavior, you would set the
# option ref_avg_fraction = 1.0. However we want to build an average out
# of all the data in run 144. So we will use a much smaller value:
ref_avg_fraction = 0.01 # This weighs the most recently seen shot
# at 1% and the previous shots at 99%
# Something specific to this data is telling TimeTool.Analyze that the
# beam is always off for run 144 so that it builds a reference from everything,
# and that it is always on for run 150, so it calculates timetool values for
# everything. The TimeTool will look at Evr Codes to determine beam on/off
# laser on/off. We need to tell it what Evr codes are associated with what.
#
# Often specific evr codes are programmed to fire with the beam or laser. In
# this case no evr codes were programmed, just which run was a no-beam and which
# run was a beam+laser run are known by the scientists. Hence we need to identify
# an evr code that is always present (this is code 140). For run 144, we say
# code 140 is associated with no beam. Then TimeTool.Analyze builds a reference
# from all the data. For run 150, we say code 0 is associated with no beam. Evr code
# 0 is illegal and will never be present. Hence TimeTool.Analyze treats every
# event as having beam and laser.
#
# To see evr codes in the data, one can use the following psana module:
#
# psana -n 40 -m psana_examples.DumpEvr exp=sxrd5814:run=144
evrSrc = psana.Source('DetInfo(NoDetector.0:Evr.0)')
eventCodeWhichDoesntExist = 0
evrCodeAlwaysPresent = 140
def makeReferenceFiles(dataSource,
ttRefFile,
numpyRefFile,
refNumEvents,
opal_src,
put_key,
sxr_timetool_weights):
if os.path.exists(ttRefFile):
os.unlink(ttRefFile)
if os.path.exists(numpyRefFile):
os.unlink(numpyRefFile)
global eventCodeWhichDoesntExist
global evrCodeAlwaysPresent
psana.setOption('TimeTool.Analyze.get_key',opal_src)
psana.setOption('TimeTool.Analyze.put_key',put_key)
# this is a new option to get the time tool data as ndarrays, visible by Python
psana.setOption('TimeTool.Analyze.put_ndarrays',True)
# we want to build a reference from all shots.
psana.setOption('TimeTool.Analyze.eventcode_nobeam',evrCodeAlwaysPresent)
#don't skip any events, laser is always on, set to jnk event code
psana.setOption('TimeTool.Analyze.eventcode_skip',eventCodeWhichDoesntExist)
# don't use ipm to detect presence of beam, we alway get beam
psana.setOption('TimeTool.Analyze.ipm_get_key','')
psana.setOption('TimeTool.Analyze.calib_poly','0 1 0') # no calibration
psana.setOption('TimeTool.Analyze.projectX',True)
psana.setOption('TimeTool.Analyze.proj_cut',0)
global sig_roi_x
global sig_roi_y
psana.setOption('TimeTool.Analyze.sig_roi_x','%d %d' % (sig_roi_x[0],sig_roi_x[1]))
psana.setOption('TimeTool.Analyze.sig_roi_y','%d %d' % (sig_roi_y[0],sig_roi_y[1]))
# no side band analysis
psana.setOption('TimeTool.Analyze.sb_roi_x','')
psana.setOption('TimeTool.Analyze.sb_roi_y','')
psana.setOption('TimeTool.Analyze.sb_avg_fraction',sb_avg_fraction)
psana.setOption('TimeTool.Analyze.ref_avg_fraction',ref_avg_fraction)
psana.setOption('TimeTool.Analyze.ref_load', '')
# store the final reference
psana.setOption('TimeTool.Analyze.ref_store', ttRefFile)
psana.setOption('TimeTool.Analyze.weights', sxr_timetool_weights)
# when the psana option events is 0, psana goes through all the data
if refNumEvents is None:
psana.setOption('psana.events',0)
else:
psana.setOption('psana.events',refNumEvents)
psana.setOption('modules','TimeTool.Analyze')
ds = psana.DataSource(dataSource)
events = ds.events()
reportEventInterval = 50
global evrSrc
# we will also build up a average of these images to use in interactive plotting.
numBkg = 0
bkg = None
opalPsanaSource = psana.Source(opal_src)
print "-----Building Reference:-----"
for ii, evt in enumerate(events):
evrData = evt.get(psana.EvrData.DataV3, evrSrc)
if evrData is None:
print " no evr data in event %d, could be xtcav data, set stream=0-12 in datasource" % ii
continue
hasAlwaysPresent = False
hasDoesntExist = False
for fifoEvent in evrData.fifoEvents():
if fifoEvent.eventCode() == evrCodeAlwaysPresent:
hasAlwaysPresent = True
if fifoEvent.eventCode() == eventCodeWhichDoesntExist:
hasDoesntExist = True
if not hasAlwaysPresent:
print " Unexpected: evr code %d not found in fifoList - should always be there" % evrCodeAlwaysPresent
if hasDoesntExist:
print " Unexpected: evr code %d found in fifoList - it should not be present" % eventCodeWhichDoesntExist
opal = evt.get(psana.Camera.FrameV1, opalPsanaSource)
if opal is not None:
if bkg is None:
bkg = np.array(opal.data16(),np.float)
numBkg = 1
else:
bkg *= numBkg/float((numBkg+1))
bkg += ((1.0/float((numBkg+1))) * np.array(opal.data16(),np.float))
numBkg += 1
if ii % reportEventInterval == 0:
print " event: %d" % ii
np.save(numpyRefFile,bkg)
def analyze(dataSource,
overwrite,
ttRefFile,
ttResultFile,
analNumEvents,
opal_src,
put_key,
sxr_timetool_weights):
assert os.path.exists(ttRefFile), "time tool reference file doesn't exist"
if os.path.exists(ttResultFile):
if overwrite:
os.unlink(ttResultFile)
else:
raise Exception("analyze: ttResultFile %s already exists and overwrite is False" % ttResultFile)
global eventCodeWhichDoesntExist
psana.setOption('TimeTool.Analyze.get_key',opal_src)
psana.setOption('TimeTool.Analyze.put_key',put_key)
psana.setOption('TimeTool.Analyze.put_ndarrays',True)
# we need to override any option that was set above as psana remembers config options
# now we want TimeTool.Analyze to always see laser on and beam
psana.setOption('TimeTool.Analyze.eventcode_nobeam',eventCodeWhichDoesntExist)
# don't skip any events; the laser is always on, so set this to a junk event code
psana.setOption('TimeTool.Analyze.eventcode_skip',eventCodeWhichDoesntExist)
# don't use the ipm to detect the presence of beam, we always get beam
psana.setOption('TimeTool.Analyze.ipm_get_key','')
psana.setOption('TimeTool.Analyze.calib_poly','0 1 0') # no calibration
psana.setOption('TimeTool.Analyze.projectX',True)
psana.setOption('TimeTool.Analyze.proj_cut',0)
global sig_roi_x
global sig_roi_y
psana.setOption('TimeTool.Analyze.sig_roi_x','%d %d' % (sig_roi_x[0],sig_roi_x[1]))
psana.setOption('TimeTool.Analyze.sig_roi_y','%d %d' % (sig_roi_y[0],sig_roi_y[1]))
psana.setOption('TimeTool.Analyze.sb_roi_x','')
psana.setOption('TimeTool.Analyze.sb_roi_y','')
psana.setOption('TimeTool.Analyze.sb_avg_fraction',sb_avg_fraction)
psana.setOption('TimeTool.Analyze.ref_avg_fraction',ref_avg_fraction)
# load reference from previous step
psana.setOption('TimeTool.Analyze.ref_load', ttRefFile)
# do not store a reference
psana.setOption('TimeTool.Analyze.ref_store', '')
psana.setOption('TimeTool.Analyze.weights', sxr_timetool_weights)
# configure the Translator to only translate the ndarrays that TimeTool.Analyze will emit
psana.setOption('Translator.H5Output.output_file',ttResultFile)
# turn off compression and shuffle, shuffle is only useful if compression is on
psana.setOption('Translator.H5Output.deflate',-1)
psana.setOption('Translator.H5Output.shuffle',False)
psana.setOption('Translator.H5Output.overwrite',overwrite)
# key options: exclude all the psana data and epics data
psana.setOption('Translator.H5Output.type_filter','exclude psana')
psana.setOption('Translator.H5Output.store_epics','no')
# now only ndarrays will be translated
if analNumEvents is None:
psana.setOption('psana.events',0)
else:
psana.setOption('psana.events',analNumEvents)
# important: load Translator.H5Output after TimeTool.Analyze so that it sees the
# ndarrays that it puts in the Event
psana.setOption('modules','TimeTool.Analyze Translator.H5Output')
ds = psana.DataSource(dataSource)
events = ds.events()
reportEventInterval = 50
global evrSrc
print "---Analyze---"
for ii, evt in enumerate(events):
evrData = evt.get(psana.EvrData.DataV3, evrSrc)
if evrData is None:
print " no evr data in event %d, could be xtcav data, set stream=0-12 in datasource" % ii
continue
hasDoesntExist = False
for fifoEvent in evrData.fifoEvents():
if fifoEvent.eventCode() == eventCodeWhichDoesntExist:
hasDoesntExist = True
if hasDoesntExist:
print " Unexpected: evr code %d found in fifoList - it should not be present" % eventCodeWhichDoesntExist
if ii % reportEventInterval == 0:
print " event: %d" % ii
######################################
# helper functions for interactive plotting
def findSecondFiducial(sec,fid,timeDs):
'''Takes a time dataset from a translated hdf5 file and finds index with sec/fid
ARGS:
sec - seconds to match
fid - fiducials to match
timeDs - time dataset as loaded by h5py from a translated hdf5 file
RET:
index - the 0-up index of the time dataset which has a matching second and fiducial value,
or None if not found, or more than one record matched
'''
fidMatch = timeDs['fiducials']==fid
secMatch = timeDs['seconds']==sec
matchBoth = fidMatch & secMatch
whereArray = np.where(matchBoth)[0]
if len(whereArray) == 0:
return None
if len(whereArray)>1:
print "warning: sec/fid matches more than one position in h5 time dataset, possible with damaged data/split events. sec=%r fid=%r" % (sec, fid)
return None
return whereArray[0]
def checkIndex(idx,sec,fid,timeDs):
'''if timeDs[idx] matches sec/fid, returns idx. Otherwise searches for match in timeDs
ARGS:
idx - index where we expect match
sec - seconds to match
fid - fiducials to match
timeDs - time dataset as loaded by h5py from a translated hdf5 file
RET:
index - the 0-up index of the given time dataset which has a matching second and fiducial value,
or None if not found, or more than one record matched
'''
fidDs = timeDs['fiducials'][idx]
if fidDs != fid:
return findSecondFiducial(sec, fid, timeDs)
secDs = timeDs['seconds'][idx]
if secDs != sec:
return findSecondFiducial(sec, fid, timeDs)
return idx
def getTimeToolValuesFromH5(h5calibCycle, sec, fid, put_key):
'''Gets the timeTool values from a h5 group for a given event Id.
ARGS:
h5calibCycle - h5group for a calib cycle, as returned by h5py
sec, fid - the seconds/fiducials for the eventId we want TimeTool values for
put_key - the timeTool prefix used for data stored in the hdf5 file
OUT:
a dict with the keys AMPL AMPLNXT FLTPOS FLTPOSFWHM FLTPOS_PS REFAMPL
values will be None or the TimeTool value found
'''
timeToolVars = ['AMPL','AMPLNXT','FLTPOS','FLTPOSFWHM','FLTPOS_PS','REFAMPL']
matchIndex = None
ttVals = {}
for ttVar in timeToolVars:
ttVarGroupPath = 'ndarray_float64_1/noSrc__' + put_key + ':' + ttVar
ttVarGroup = h5calibCycle[ttVarGroupPath]
assert ttVarGroup is not None, "error: path %s doesn't exist from calib cycle" % ttVarGroupPath
ttVarTimeDataSet = ttVarGroup['time']
if matchIndex is None:
matchIndex = findSecondFiducial(sec,fid,ttVarTimeDataSet)
else:
matchIndex = checkIndex(matchIndex,sec,fid,ttVarTimeDataSet)
if matchIndex is None:
ttVals[ttVar]=None
else:
ttVarDataDataSet = ttVarGroup['data']
# access data, each is a 1D array with one element:
ttVals[ttVar]=ttVarDataDataSet[matchIndex][0]
return ttVals
def plotEvent(evt, evtIdx, h5calibCycle, put_key, eventIdx, opalSrc, opalBkg, plot_offset_x):
global sig_roi_x
global sig_roi_y
eventId = evt.get(psana.EventId)
sec = eventId.time()[0]
fid = eventId.fiducials()
ttVals = getTimeToolValuesFromH5(h5calibCycle, sec, fid, put_key)
opal = evt.get(psana.Camera.FrameV1, opalSrc)
if ttVals['FLTPOS'] is None or opal is None:
print "no matching TimeTool event found in h5 file, or no opal image in this event (eventIdx=%d)" % evtIdx
else:
print "evtIdx=%4d fltpos=%7.2f ampl=%7.5f nxtampl=%7.5f" % (evtIdx, ttVals['FLTPOS'], ttVals['AMPL'], ttVals['AMPLNXT'])
opalArr = np.array(opal.data16(), np.float)-opalBkg
opalArr[opalArr<0.0]=0.0
plt.figure(1)
plt.clf()
plt.imshow(np.log(1.0+opalArr))
plt.hold(True)
plt.xlim([0,opalArr.shape[1]])
plt.ylim([opalArr.shape[0],0])
plt.plot([ttVals['FLTPOS'] + plot_offset_x, ttVals['FLTPOS'] + plot_offset_x],sig_roi_y, '-r')
# plot the ROI box
plt.plot([sig_roi_x[0], sig_roi_x[1], sig_roi_x[1], sig_roi_x[0], sig_roi_x[0] ],
[sig_roi_y[0], sig_roi_y[0], sig_roi_y[1], sig_roi_y[1], sig_roi_y[0] ], '-w')
plt.title('event %d' % evtIdx)
plt.draw()
res = raw_input("hit enter for next event, +/- n to jump n events, e for ipython, q to quit: ")
res = res.lower().strip()
if res == 'q':
return 'quit'
if res == '':
return 1
if res == 'e':
IPython.embed()
return 1
return int(res)
#### end helper functions
#########################
def interactivePlots(dataSource,
ttResultFile,
opal_src,
put_key,
sxr_timetool_weights,
plot_offset_x,
numpyRefFile):
assert os.path.exists(ttResultFile), "time tool result file doesn't exist"
assert os.path.exists(numpyRefFile), "numpy background ref doesn't exist"
bkg = np.load(numpyRefFile)
h5 = h5py.File(ttResultFile,'r')
h5Run0 = h5['/Configure:0000/Run:0000']
assert 'CalibCycle:0001' not in h5Run0.keys(), "Did not expect more than one calib cycle in output data. Script will have to be changed"
calibCycle0 = h5Run0['CalibCycle:0000']
plt.ion()
ds = psana.DataSource(dataSource)
opalSrc = None
runs = ds.runs()
for run in runs:
times = run.times()
idx = 0
while idx >=0 and idx < len(times):
evt = run.event(times[idx])
if opalSrc is None:
opalSrc = psana.Source(opal_src)
res = plotEvent(evt,idx,calibCycle0,put_key,idx,opalSrc,bkg,plot_offset_x)
del evt
if isinstance(res,str):
if res == 'quit':
print "exiting due to quit"
sys.exit(0)
else:
print "unknown result %r from plotEvent" % res
sys.exit(1)
idx += res
if idx < 0:
idx = 0
print "warning: you jumped past the start, reset index to 0"
print "finished run: %r idx=%d" % (run,idx)
|
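The ref_avg_fraction weighting described at the top of the script (most recent no-beam shot at 1%, accumulated reference at 99%) is a standard exponential running average. We do not know TimeTool.Analyze's exact internal update, but the comment's 1%/99% weighting corresponds to this sketch (shot values are made up):

```python
import numpy as np

# Exponential running average with ref_avg_fraction = 0.01: each new
# no-beam shot contributes 1%, the accumulated reference keeps 99%.
ref_avg_fraction = 0.01
shots = [np.full(4, 10.0), np.full(4, 20.0), np.full(4, 20.0)]

ref = shots[0].copy()          # first shot seeds the reference
for shot in shots[1:]:
    ref = (1.0 - ref_avg_fraction) * ref + ref_avg_fraction * shot

print(ref)  # creeps slowly from 10 toward 20: 10.0 -> 10.1 -> 10.199
```

With such a small fraction the reference evolves slowly, which is why the script can afford to build it from every event in run 144 rather than a hand-picked subset.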
...
Now copy ttanalyze.py above to myrel/ttanalyze.py.
Now copy ttlib.py to myrel/mypkg/src/ttlib.py.
...
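The opal background that makeReferenceFiles saves to numpyRefFile is accumulated with an incremental mean update, so all frames seen so far are averaged without being stored. A minimal standalone sketch of that update, using made-up 2x2 frames in place of opal images:

```python
import numpy as np

# Incremental mean: after each frame, bkg equals the average of all
# frames seen so far (same update as in makeReferenceFiles).
frames = [np.array([[0., 2.], [4., 6.]]),
          np.array([[2., 4.], [6., 8.]]),
          np.array([[4., 6.], [8., 10.]])]

bkg = None
numBkg = 0
for frame in frames:
    if bkg is None:
        bkg = frame.astype(float)
        numBkg = 1
    else:
        bkg *= numBkg / float(numBkg + 1)   # down-weight the old mean
        bkg += frame / float(numBkg + 1)    # add the new frame's share
        numBkg += 1

print(bkg)  # identical to np.mean(frames, axis=0)
```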