Interactive data analysis with iPython
- 'ipython' (http://ipython.org/) is an enhanced python shell for interactive use. Many of the examples here would work equally well with a 'regular' python shell.
- Plotting is done with 'matplotlib' (http://matplotlib.sourceforge.net/).
- If you're looking for an IDE to work with, consider 'Spyder' (http://code.google.com/p/spyderlib/).
Interactive access to XTC data
An interactive framework based on ipython is currently being explored, but does not exist yet.
You have the option of working with HDF5 files in an interactive ipython session. Be aware that when you work with HDF5 files, arrays from different sources (detectors) may not be synchronized. You will need to time-order and/or synchronize them yourself if you want to correlate data from different sources! See e.g. How to access HDF5 data from Python for how to do this.
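As a rough illustration of the synchronization step, the sketch below aligns the data of two sources on their common event timestamps using NumPy. The array names, and the assumption that each source carries exactly one unique integer timestamp per event, are hypothetical; how the time records are actually laid out in an LCLS HDF5 file is covered by the page referenced above.

```python
import numpy as np

def align_by_timestamp(time_a, data_a, time_b, data_b):
    """Return the timestamps present in both sources (sorted in time)
    together with the matching entries of data_a and data_b.
    Assumes timestamps are unique within each source."""
    common, idx_a, idx_b = np.intersect1d(time_a, time_b, return_indices=True)
    return common, data_a[idx_a], data_b[idx_b]

# Made-up example: source B missed event 12 and is stored out of order.
time_a = np.array([10, 11, 12, 13])
data_a = np.array([1.0, 1.1, 1.2, 1.3])
time_b = np.array([13, 10, 11])
data_b = np.array([23.0, 20.0, 21.0])

common, a, b = align_by_timestamp(time_a, data_a, time_b, data_b)
# a[i] and b[i] now belong to the same event, common[i]
```

Events seen by only one of the sources are dropped, so the returned arrays can be correlated element by element.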
A benefit of XTC files is that they are available immediately, so you can start analyzing before the run has finished collecting. The benefit of using the LCLS framework(s) is that each event is easily extracted and you don't have to worry about time-ordering or synchronizing data from different devices.
If you'd like to analyze XTC files with iPython, the options that exist are:
- ipython in combination with pyana. The XTC Explorer is an example of how this can be done (XtcExplorer#IPython).
Note that currently there is no way to run pyana from IPython, but you can run a pyana job and launch ipython at the end to play with the plots/arrays.
- Write your own application.
Data visualization with NumPy (arrays) and MatPlotLib (plots).
This is not meant to be documentation or a tutorial for matplotlib or numpy. Just a place to document stuff that I have a hard time finding explained elsewhere.
- This is really just simple python. But since the 'matplotlib' documentation can be frustratingly terse about which functions and attributes are available for its various classes/objects, I found this a useful way to inspect what an object knows about itself.
  Inspecting objects:

      # print the name and current value of every attribute of obj
      for attr_name in dir(obj):
          attribute = getattr(obj, attr_name)
          print("%s: %s" % (attr_name, attribute))
- This example shows saving and loading of a binary numpy file (.npy) and an ascii file (.dat).
  Saving and loading:

      import numpy as np

      # binary file
      np.save("filename.npy", array)
      array = np.load("filename.npy")

      # ascii file
      np.savetxt("filename.dat", array)
      array = np.loadtxt("filename.dat")
This only works with single arrays: np.save stores one array per file, and np.savetxt only handles arrays of at most 2 dimensions.
If you need to save multiple events/shots in the same file you will need to do some tricks (e.g. flatten each array and stack the 1d arrays into a 2d array where the second axis represents the event number). Or you could save as an HDF5 file.
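The flatten-and-stack trick could look roughly like the sketch below. The image shape, the number of events and the file names are made up for illustration, and here the events run along the first axis (one row per event). np.savez is shown as another NumPy option for keeping several named arrays in one binary file.

```python
import os
import tempfile

import numpy as np

outdir = tempfile.mkdtemp()          # scratch directory for the example files
n_events, shape = 5, (3, 4)

# Stand-in data: one small 2d "image" per event.
images = [np.full(shape, float(i)) for i in range(n_events)]

# Flatten each image to 1d and stack them: one row per event.
stacked = np.vstack([img.ravel() for img in images])
np.savetxt(os.path.join(outdir, "events.dat"), stacked)

# Reload and restore the (event, row, column) layout.
restored = np.loadtxt(os.path.join(outdir, "events.dat")).reshape((-1,) + shape)

# Alternative: several named arrays in a single binary .npz file.
named = dict(("shot%d" % i, img) for i, img in enumerate(images))
np.savez(os.path.join(outdir, "events.npz"), **named)
loaded = np.load(os.path.join(outdir, "events.npz"))
```

The original image shape is not stored in the ascii file, so it has to be known (or saved separately) when reloading.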
- This example is shown in a pyana setting. The HDF5 file is declared and opened in beginjob, datasets are created for each event, and the file is closed in the endjob method.
  Saving to HDF5:

      import h5py

      # the functions below are methods of a pyana module class

      def beginjob(self,evt,env):
          self.ofile = h5py.File("outputfile.hdf5", 'w') # open for writing (overwrites existing file)
          self.shot_counter = 0

      def event(self,evt,env):
          # example: store several arrays from one shot in a group labeled with shot (event) number
          self.shot_counter += 1
          group = self.ofile.create_group("Shot%d" % self.shot_counter)

          image1_source = "CxiSc1-0|TM6740-1"
          image2_source = "CxiSc1-0|TM6740-2"

          frame = evt.getFrameValue(image1_source)
          image1 = frame.data()
          frame = evt.getFrameValue(image2_source)
          image2 = frame.data()

          dataset1 = group.create_dataset("%s"%image1_source,data=image1)
          dataset2 = group.create_dataset("%s"%image2_source,data=image2)

      def endjob(self,env):
          self.ofile.close()
Or you can group your datasets any other way you find useful, of course.
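For completeness, here is a sketch of how a file with this layout could be read back with h5py in an interactive session. It first writes a small stand-in file (made-up array contents, temporary path) so that it runs on its own; only the group and dataset names are taken from the example above.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "outputfile.hdf5")

# Write a small stand-in file with the same layout as above:
# one group per shot, one dataset per image source.
with h5py.File(path, "w") as ofile:
    for shot in (1, 2):
        group = ofile.create_group("Shot%d" % shot)
        group.create_dataset("CxiSc1-0|TM6740-1", data=np.full((2, 3), shot))
        group.create_dataset("CxiSc1-0|TM6740-2", data=np.zeros((2, 3)))

# Read it back: files and groups behave much like nested dictionaries.
shots = {}
with h5py.File(path, "r") as ifile:
    for shot_name in ifile:                       # "Shot1", "Shot2"
        group = ifile[shot_name]
        # [...] copies the dataset contents into a numpy array
        shots[shot_name] = dict((name, group[name][...]) for name in group)
```

After the file is closed, `shots` holds plain numpy arrays that can be plotted or manipulated as usual.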
Examples from MATLAB
These examples are mostly python rewrites of matlab functions provided by XPP. Thanks to H. Lemke for matlab examples and advice.
- The "library": pymatlab.py
... a module implementing in python some of the tools written by Henrik/XPP for matlab. For those familiar with the XPP matlab tools, the functions here should be intuitive to use. Only a few functions have been implemented thus far... (feel free to contribute).
- Starting iPython

      [ofte@psana0XXX myrelease]$ ipython
      Python 2.4.3 (#1, Nov 3 2010, 12:52:40)
      Type "copyright", "credits" or "license" for more information.

      IPython 0.9.1 -- An enhanced Interactive Python.
      ?         -> Introduction and overview of IPython's features.
      %quickref -> Quick reference.
      help      -> Python's own help system.
      object?   -> Details about 'object'. ?object also works, ?? prints more.

Loading the library. Normally 'import pymatlab' would be recommended, but if you do 'from pymatlab import *', all the functions defined in this module get loaded into the current namespace, and you can see them in your workspace. This might be easier for interactive work.

    In [1]: from pymatlab import *
    Pretend this is matlab

'who' gives you a short list of workspace contents:

    In [2]: who
    H5getobjnames   ScanInput   ScanOutput   filtvec   findmovingmotor
    getSTDMEANfrac_from_startpoint   get_filter   get_limits   get_limits_automatic
    get_limits_channelhist   get_limits_correlation   get_limits_corrfrac
    h5py   np   plt   rdXPPdata   runexpNO2fina   scan   scaninput

'whos' gives you a longer list of workspace contents:

    In [3]: whos
    Variable                         Type          Data/Info
    --------------------------------------------------------
    H5getobjnames                    function      <function H5getobjnames at 0x2b57de8>
    ScanInput                        type          <class 'pymatlab.ScanInput'>
    ScanOutput                       type          <class 'pymatlab.ScanOutput'>
    filtvec                          function      <function filtvec at 0x2b57f50>
    findmovingmotor                  function      <function findmovingmotor at 0x2b57d70>
    getSTDMEANfrac_from_startpoint   function      <function getSTDMEANfrac_<...>_startpoint at 0x2b581b8>
    get_filter                       function      <function get_filter at 0x2b57ed8>
    get_limits                       function      <function get_limits at 0x2b58050>
    get_limits_automatic             function      <function get_limits_automatic at 0x2b58230>
    get_limits_channelhist           function      <function get_limits_channelhist at 0x2b582a8>
    get_limits_correlation           function      <function get_limits_correlation at 0x2b580c8>
    get_limits_corrfrac              function      <function get_limits_corrfrac at 0x2b58140>
    h5py                             module        <module 'h5py' from '/reg<...>ython/h5py/__init__.pyc'>
    np                               module        <module 'numpy' from '/re<...>thon/numpy/__init__.pyc'>
    plt                              module        <module 'matplotlib.pyplo<...>n/matplotlib/pyplot.pyc'>
    rdXPPdata                        function      <function rdXPPdata at 0x2b57c80>
    runexpNO2fina                    function      <function runexpNO2fina at 0x2b57e60>
    scan                             ScanOutput    <pymatlab.ScanOutput object at 0x2b60536bee90>
    scaninput                        ScanInput     <pymatlab.ScanInput object at 0x2b60536b4e90>

1) Select limits from graphical input and plot filtered IPIMB
Here's a log from a session that produces a loglog plot (blue dots) of two IPIMB channels, selects limits from graphical input (mouse clicks), and draws the selected events with red dots.

    In [1]: from pymatlab import *
    Pretend this is matlab

    In [2]: whos
    Variable                         Type        Data/Info
    ------------------------------------------------------
    H5getobjnames                    function    <function H5getobjnames at 0x1379fde8>
    ScanInput                        type        <class 'pymatlab.ScanInput'>
    ScanOutput                       type        <class 'pymatlab.ScanOutput'>
    filtvec                          function    <function filtvec at 0x1379ff50>
    findmovingmotor                  function    <function findmovingmotor at 0x1379fd70>
    getSTDMEANfrac_from_startpoint   function    <function getSTDMEANfrac_<...>startpoint at 0x137a01b8>
    get_filter                       function    <function get_filter at 0x1379fed8>
    get_limits                       function    <function get_limits at 0x137a0050>
    get_limits_automatic             function    <function get_limits_automatic at 0x137a0230>
    get_limits_channelhist           function    <function get_limits_channelhist at 0x137a02a8>
    get_limits_correlation           function    <function get_limits_correlation at 0x137a00c8>
    get_limits_corrfrac              function    <function get_limits_corrfrac at 0x137a0140>
    h5py                             module      <module 'h5py' from '/reg<...>ython/h5py/__init__.pyc'>
    np                               module      <module 'numpy' from '/re<...>thon/numpy/__init__.pyc'>
    plt                              module      <module 'matplotlib.pyplo<...>n/matplotlib/pyplot.pyc'>
    rdXPPdata                        function    <function rdXPPdata at 0x1379fc80>
    runexpNO2fina                    function    <function runexpNO2fina at 0x1379fe60>

    In [3]: scaninput = ScanInput()

    In [4]: scaninput.fina = "/reg/d/psdm/XPP/xpp23410/hdf5/xpp23410-r0107.h5"

    In [5]: scan = rdXPPdata(scaninput)
    Reading XPP data from /reg/d/psdm/XPP/xpp23410/hdf5/xpp23410-r0107.h5
    Found pv control object fs2:ramp_angsft_target
    Found scan vector
    [ 2800120. 2800240. 2800360. 2800480. 2800600. 2800720.
      2800840. 2800960. 2801080. 2801200. 2801320. 2801440.
      2801560. 2801680. 2801800. 2801920. 2802040. 2802160.
      2802280. 2802400. 2802520. 2802640. 2802760. 2802880.
      2803000. 2803120. 2803240. 2803360. 2803480. 2803600.
      2803720. 2803840. 2803960. 2804080. 2804200. 2804320.
      2804440. 2804560. 2804680. 2804800. 2804920. 2805040.
      2805160. 2805280.]
    Fetching data to correlate with motor
    ['IPM1', 'IPM2']
    (44, 120, 4)

    In [6]: channels = np.concatenate(scan.scandata,axis=0)

    In [7]: channels.shape
    Out[7]: (5280, 4)

    In [8]: get_limits(channels,1,"correlation")
    4 channels a 5280 events
    indexes that pass filter: (array([ 1, 5, 8, ..., 5266, 5272, 5273]),)
    Out[8]:
    array([[ 0.00086654, 0.01604564],
           [ 0.67172102, 0.71968567],
           [ 0.00194716, 0.01447819],
           [ 0.80365403, 0.73463468]])

    In [9]: plt.draw()

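The np.concatenate call in the session merges the per-scan-step data into one long list of events. The sketch below reproduces only the shapes from the log, using random numbers as a stand-in for scan.scandata; reading (44, 120, 4) as 44 scan steps with 120 events of 4 channels each is an assumption based on the printed scan vector and shapes.

```python
import numpy as np

n_steps, n_events, n_channels = 44, 120, 4

# Stand-in for scan.scandata: random numbers with the shape printed in the log.
scandata = np.random.rand(n_steps, n_events, n_channels)

# Treat the first axis as a sequence of (120, 4) blocks and join them end to end.
channels = np.concatenate(scandata, axis=0)
print(channels.shape)  # (5280, 4)

# For a regular array this is the same as a reshape.
same = scandata.reshape(-1, n_channels)

# Event k of scan step i ends up in row i * n_events + k.
i, k = 3, 7
row = channels[i * n_events + k]
```

np.concatenate would also work if scandata were a list of arrays with a different number of events per step, which a plain reshape would not handle.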
Comparison of MATLAB and MatPlotLib

- Loglog plot of one array vs. another
  MatLab:

      a1 = subplot(121);
      loglog(channels(:,1),channels(:,2),'o')
      xlabel('CH0')
      ylabel('CH1')
      a2 = subplot(122);
      loglog(channels(:,3),channels(:,4),'o')
      xlabel('CH2')
      ylabel('CH3')

  MatPlotLib:

      import matplotlib.pyplot as plt
      import numpy as np

      a1 = plt.subplot(221)
      plt.loglog(channels[:,0],channels[:,1], 'o' )
      plt.xlabel('CH0')
      plt.ylabel('CH1')
      a2 = plt.subplot(222)
      plt.loglog(channels[:,2],channels[:,3], 'o' )
      plt.xlabel('CH2')
      plt.ylabel('CH3')

  Comments: channels is an Nx4 array of floats, where N is the number of events. Each column corresponds to one of the four IPIMB channels.

- Array of limits from graphical input
  MatLab:

      axes(a1)
      hold on
      lims(1:2,:) = ginput(2);
      axes(a2)
      hold on
      lims(3:4,:) = ginput(2);

  MatPlotLib:

      lims = np.zeros((4,2),dtype="float")
      # plt.hold was removed in matplotlib 3.0, where new plots are overlaid by default
      plt.axes(a1)
      plt.hold(True)
      lims[0:2,:] = plt.ginput(2)
      plt.axes(a2)
      plt.hold(True)
      lims[2:4,:] = plt.ginput(2)

  Comments: In MatLab,

- Filter
  MatLab:

      % first plot: limits are in rows 1:2 of lims, column 1 (x) and column 2 (y)
      fbool1 = (channels(:,1)>min(lims(1:2,1)))&(channels(:,1)<max(lims(1:2,1)))
      fbool2 = (channels(:,2)>min(lims(1:2,2)))&(channels(:,2)<max(lims(1:2,2)));
      fbool = fbool1&fbool2
      loglog(channels(fbool,1),channels(fbool,2),'or')
      % second plot: limits are in rows 3:4 of lims
      fbool3 = (channels(:,3)>min(lims(3:4,1)))&(channels(:,3)<max(lims(3:4,1)))
      fbool4 = (channels(:,4)>min(lims(3:4,2)))&(channels(:,4)<max(lims(3:4,2)));
      fbool = fbool3&fbool4
      loglog(channels(fbool,3),channels(fbool,4),'or')

  MatPlotLib:

      # first plot: limits are in rows 0:2 of lims, column 0 (x) and column 1 (y)
      fbools0 = (channels[:,0]>lims[0:2,0].min())&(channels[:,0]<lims[0:2,0].max())
      fbools1 = (channels[:,1]>lims[0:2,1].min())&(channels[:,1]<lims[0:2,1].max())
      fbools = fbools0 & fbools1
      plt.loglog(channels[fbools,0],channels[fbools,1],'or')
      # second plot: limits are in rows 2:4 of lims
      fbools2 = (channels[:,2]>lims[2:4,0].min())&(channels[:,2]<lims[2:4,0].max())
      fbools3 = (channels[:,3]>lims[2:4,1].min())&(channels[:,3]<lims[2:4,1].max())
      fbools = fbools2&fbools3
      plt.loglog(channels[fbools,2],channels[fbools,3],'or')
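The filter step can be tried without a plot window. The sketch below wraps the comparison in a small helper, box_filter, which is a hypothetical name and not part of pymatlab; the channel values and the four "clicked" points are made up, standing in for what plt.ginput(2) would return for each plot.

```python
import numpy as np

def box_filter(x, y, corners):
    """Boolean mask of the points (x, y) lying strictly inside the
    rectangle spanned by two corner points.
    corners: two (x, y) pairs, e.g. the list returned by plt.ginput(2)."""
    corners = np.asarray(corners, dtype=float)
    xmin, ymin = corners.min(axis=0)
    xmax, ymax = corners.max(axis=0)
    return (x > xmin) & (x < xmax) & (y > ymin) & (y < ymax)

# Made-up data: 3 events, 4 channels.
channels = np.array([[0.1, 0.2, 0.5, 0.5],
                     [0.5, 0.6, 0.9, 0.1],
                     [0.9, 0.8, 0.4, 0.6]])

# Made-up mouse clicks: rows 0:2 for the CH0/CH1 plot, rows 2:4 for CH2/CH3.
lims = np.array([[0.3, 0.4], [0.7, 0.9],
                 [0.3, 0.4], [0.6, 0.7]])

mask01 = box_filter(channels[:, 0], channels[:, 1], lims[0:2])
mask23 = box_filter(channels[:, 2], channels[:, 3], lims[2:4])

selected01 = channels[mask01]   # events inside the first rectangle
```

Keeping one mask per plot, rather than reusing a single variable, makes it possible to combine them afterwards (for example `mask01 & mask23`).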