Interactive data analysis with IPython

Interactive access to XTC data

You cannot currently open an IPython or python shell and read in XTC data. Such an interactive framework for LCLS data is currently being explored, but does not exist yet.

What you can do is load data arrays into an interactive IPython session. Since you cannot load the XTC file directly into IPython, you'll need to run pyana or similar to create the arrays first. You also have the option of loading arrays from the LCLS HDF5 files, but be aware that when you work with HDF5 files, arrays from different sources (detectors) may not be synchronized. You will need to time-order and/or synchronize them yourself if you want to correlate data from different sources! See e.g. How to access HDF5 data from Python for how to do this.
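
As a rough illustration of what "synchronize them yourself" can involve, the sketch below matches events from two sources by a per-event timestamp. It assumes that each source comes with its own array of timestamps alongside its data and that timestamps are unique within a source. The function name, array names and values are made up for illustration and do not reflect the actual layout of the LCLS HDF5 files.

```python
import numpy as np

def match_events(time_a, data_a, time_b, data_b):
    """Return time-ordered data from both sources for the timestamps they share."""
    # intersect1d gives the sorted common timestamps and their positions in each input
    common, idx_a, idx_b = np.intersect1d(time_a, time_b, return_indices=True)
    return common, data_a[idx_a], data_b[idx_b]

# Hypothetical input: source B is out of order, misses events 10 and 13, and has an extra one
time_a = np.array([10, 11, 12, 13, 14])
data_a = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
time_b = np.array([14, 11, 15, 12])
data_b = np.array([50.0, 20.0, 60.0, 30.0])

common, a, b = match_events(time_a, data_a, time_b, data_b)
```

After this, a[i] and b[i] belong to the same event common[i], so the two arrays can be plotted against each other directly.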

The benefit of working directly with the XTC files is that they are available immediately, so you can start analyzing before the run has finished collecting. The benefit of using the LCLS framework(s) is that each event is easily extracted and you don't have to worry about time-ordering or synchronizing data from different devices.

If you'd like to analyze XTC files with IPython, the options that exist are:

  • IPython in combination with pyana. The XTC Explorer is an example of how this can be done (XtcExplorer#IPython).
    Note that currently there is no way to run pyana from IPython, but you can run a pyana job and launch IPython at the end to play with the plots/arrays.
  • Write your own application, or be patient and wait for our interactive framework solution.

Because of this limitation, this page does not yet deliver what its title promises ("How to access XTC data from Python"). It is a placeholder that explores some of the functionality we will be able to use with XTC files later. The interactive analysis in (I)Python is the same in the end.

This page tries to summarize how to read XTC files interactively.

  • How to read XTC files (LCLS's primary data format)?

    The short answer: Not yet possible!

    The long answer: Several tools exist to read XTC files sequentially. Currently, to work interactively with data from XTC files you should read it with one of these tools, store the data in memory or in a file, and then work with it interactively in your tool of choice (e.g. IPython).
    Existing tools:

    • psana framework (C++)
    • pyana framework (python)
    • xtcreader (C++) / pyxtcreader (python)
    • xtcscanner (python)
    • xtcexplorer (python)

    Coming soon! We are currently working on better infrastructure for interactive analysis of XTC files. We welcome your input if you think you may be one of its users.

    The rest of this page elaborates on the long answer, with a bias towards using python.
  • XTC files can be translated to HDF5 format on request.
    These may allow interactive analysis by outside tools (e.g. Matlab) or python (see How to access HDF5 data from Python). Be aware, though, that the framework (psana or pyana) does the job of synchronizing event data for you, and the lack of synchronization of arrays in the HDF5 files is the biggest drawback of working on data files outside of our framework.

The existing tools:

Python framework: pyana

The most pain-free way to access LCLS XTC data files from python is through LCLS's python framework, pyana. It is a non-interactive framework, but to some extent you can work interactively with the data it produces.
All about pyana.

C++ framework: psana

The idea is the same as for pyana. Non-interactive. No interactive support as of yet.
All about psana.

If you like GUIs:

The XTC Explorer gives you an "interactive" way to configure your analysis.

If you like python (or IPython):

Python/IPython can be used to analyze data after you've saved them, or they can be embedded into a pyana module to give you interactive access to the data at regular intervals throughout your analysis.
'IPython' (http://ipython.org/) is an enhanced python shell for interactive use. Many of the examples here would work equally well with a 'regular' python shell.
Plotting is done with 'matplotlib' (http://matplotlib.sourceforge.net/).
If you're looking for an IDE to work with, consider 'Spyder' (http://code.google.com/p/spyderlib/).
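
To make the idea of "interactive access at regular intervals" more concrete, here is a minimal sketch of a pyana-style module that pauses every so many events. The class name PeriodicInspector and the interval and hook parameters are hypothetical and not part of pyana; only the beginjob/event/endjob method names follow the pyana examples on this page. The default hook uses IPython.embed from current IPython releases, which differs from the entry point in the old IPython version shown in the session logs here.

```python
class PeriodicInspector(object):
    """Hypothetical pyana-style module that hands control to a hook every `interval` events."""

    def __init__(self, interval=100, hook=None):
        self.interval = int(interval)
        # The hook receives a dict with the data collected so far;
        # if none is given, an embedded IPython shell is opened.
        self.hook = hook if hook is not None else self._ipython_hook
        self.values = []

    @staticmethod
    def _ipython_hook(namespace):
        # Imported lazily so the class can be used without IPython installed
        import IPython
        values = namespace["values"]  # visible as a local variable inside the embedded shell
        IPython.embed()

    def beginjob(self, evt, env):
        self.values = []

    def event(self, evt, env):
        # A real module would extract detector data from evt here;
        # this sketch simply stores whatever it is handed.
        self.values.append(evt)
        if len(self.values) % self.interval == 0:
            self.hook({"values": list(self.values)})

    def endjob(self, env):
        self.hook({"values": list(self.values)})


# Usage with a non-interactive hook that only records how much data it was shown
seen = []
module = PeriodicInspector(interval=3, hook=lambda ns: seen.append(len(ns["values"])))
module.beginjob(None, None)
for shot in range(7):
    module.event(shot, None)
module.endjob(None)
```

Passing the hook in as a parameter keeps the event loop itself non-interactive, so the same module can run unattended in a batch job.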

Interactively exploring the XTC file.

A quick way to figure out what's in your XTC file is to run 'xtcscanner' or 'xtcexplorer'. The output can help you write a pyana module for further analysis. The explorer allows you to make some quick plots too.

xtcscanner

This tool also belongs to the XtcExplorer package, and is used by the GUI. But the tool can also be run directly from the command line:

Code Block

usage: xtcscanner [options] xtc-files ...

options:
  -h, --help            show this help message and exit
  -n NDATAGRAMS, --ndatagrams=NDATAGRAMS
  -v, --verbose
  -l L1_OFFSET, --l1-offset=L1_OFFSET

Example:

Code Block

xtcscanner -n 200 /reg/d/psdm/AMO/amo01509/xtc/e8-r0094-s0*

Scanning....
Start parsing files:
['/reg/d/psdm/AMO/amo01509/xtc/e8-r0094-s00-c00.xtc', '/reg/d/psdm/AMO/amo01509/xtc/e8-r0094-s01-c00.xtc']
  201 datagrams read in 0.070000 s .   .   .   .   .   .   .
-------------------------------------------------------------
XtcScanner information:
  - 1 calibration cycles.
  - Events per calib cycle:
   [197]

Information from  0  control channels found:
Information from  9  devices found
                      BldInfo:EBeam:             EBeamBld (197)
            BldInfo:FEEGasDetEnergy:             FEEGasDetEnergy (197)
        DetInfo:AmoETof-0|Acqiris-0:  (5 ch)     AcqConfig_V1 (1)   AcqWaveform_V1 (197)
      DetInfo:AmoGasdet-0|Acqiris-0:  (2 ch)     AcqConfig_V1 (1)   AcqWaveform_V1 (197)
        DetInfo:AmoITof-0|Acqiris-0:  (1 ch)     AcqConfig_V1 (1)   AcqWaveform_V1 (197)
        DetInfo:AmoMbes-0|Acqiris-0:  (1 ch)     AcqConfig_V1 (1)   AcqWaveform_V1 (197)
     DetInfo:EpicsArch-0|NoDevice-0:             Epics_V1 (688)
         DetInfo:NoDetector-0|Evr-0:             EvrConfig_V2 (1)
                          ProcInfo::             RunControlConfig_V1 (11)
XtcScanner is done!
-------------------------------------------------------------
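
If you want to use this listing as a starting point for a pyana module, it can help to turn it into a Python dictionary. The sketch below parses device lines of the form shown above. It is only a guess at the format based on this one example output, and the names parse_devices, DEVICE_LINE and TYPE_COUNT are made up for illustration; they are not part of xtcscanner.

```python
import re

# A device line looks like: "<address>:  (N ch)  TypeName (count)  TypeName (count)"
DEVICE_LINE = re.compile(r"^\s*(?P<address>\S+?):\s+(?:\(\d+ ch\)\s+)?(?P<types>.+)$")
TYPE_COUNT = re.compile(r"(\w+) \((\d+)\)")

def parse_devices(text):
    """Map each device address to a {data type: number of entries} dictionary."""
    devices = {}
    for line in text.splitlines():
        match = DEVICE_LINE.match(line)
        if not match:
            continue
        counts = TYPE_COUNT.findall(match.group("types"))
        if counts:
            devices[match.group("address")] = dict((name, int(n)) for name, n in counts)
    return devices

# A few lines copied from the example output above
sample = """\
Information from  9  devices found
                      BldInfo:EBeam:             EBeamBld (197)
        DetInfo:AmoETof-0|Acqiris-0:  (5 ch)     AcqConfig_V1 (1)   AcqWaveform_V1 (197)
                          ProcInfo::             RunControlConfig_V1 (11)
"""
devices = parse_devices(sample)
```

The keys of the resulting dictionary are the address strings as printed by the scanner, without the colon that terminates each address in the listing.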

The XtcExplorer GUI.

With interactive python embedded, see: https://confluence.slac.stanford.edu/display/PCDS/XTC+Explorer#XTCExplorer-InteractiveplottingwithIPython

IPython used "like" MATLAB

Of course MATLAB is much more than this, but here's what we've started with. Here are some examples with IPython based on matlab functions provided by XPP. Thanks to H. Lemke for matlab examples and advice. A python module pymatlab.py defines a number of functions to use in this analysis example.

Starting an interactive session

Starting IPython:

Code Block

[ofte@psana0XXX myrelease]$ ipython -pylab
Python 2.4.3 (#1, Nov  3 2010, 12:52:40)
Type "copyright", "credits" or "license" for more information.

IPython 0.9.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object'. ?object also works, ?? prints more.

Load the library module:

Code Block

In [1]: from pymatlab import *

Generally, it is recommended to load library modules with 'import pymatlab' and access all its functions and classes as pymatlab.function. In an interactive session it may be easier to have the contents of pymatlab directly in your workspace by doing 'from pymatlab import *'.

List the workspace contents ('who' or 'whos'):

Code Block

In [2]: who
H5getobjnames   ScanInput       ScanOutput      filtvec findmovingmotor
getSTDMEANfrac_from_startpoint  get_filter get_limits       get_limits_automatic
get_limits_channelhist  get_limits_correlation  get_limits_corrfrac
h5py    np      plt     rdXPPdata       runexpNO2fina       scan       scaninput

In [3]: whos
Variable                         Type          Data/Info
--------------------------------------------------------
H5getobjnames                    function      <function H5getobjnames at 0x2b57de8>
ScanInput                        type          <class 'pymatlab.ScanInput'>
ScanOutput                       type          <class 'pymatlab.ScanOutput'>
filtvec                          function      <function filtvec at 0x2b57f50>
findmovingmotor                  function      <function findmovingmotor at 0x2b57d70>
getSTDMEANfrac_from_startpoint   function      <function getSTDMEANfrac_<...>_startpoint at 0x2b581b8>
get_filter                       function      <function get_filter at 0x2b57ed8>
get_limits                       function      <function get_limits at 0x2b58050>
get_limits_automatic             function      <function get_limits_automatic at 0x2b58230>
get_limits_channelhist           function      <function get_limits_channelhist at 0x2b582a8>
get_limits_correlation           function      <function get_limits_correlation at 0x2b580c8>
get_limits_corrfrac              function      <function get_limits_corrfrac at 0x2b58140>
h5py                             module        <module 'h5py' from '/reg<...>ython/h5py/__init__.pyc'>
np                               module        <module 'numpy' from '/re<...>thon/numpy/__init__.pyc'>
plt                              module        <module 'matplotlib.pyplo<...>n/matplotlib/pyplot.pyc'>
rdXPPdata                        function      <function rdXPPdata at 0x2b57c80>
runexpNO2fina                    function      <function runexpNO2fina at 0x2b57e60>
scan                             ScanOutput    <pymatlab.ScanOutput object at 0x2b60536bee90>
scaninput                        ScanInput     <pymatlab.ScanInput object at 0x2b60536b4e90>

Like in MATLAB, who gives you a short list of workspace contents, whos gives you a longer list of workspace contents.

Plot filtered IPIMB data with limits from graphical input:

Here's a log from a session that produces a loglog plot (blue dots) of two IPIMB channels, selects limits from graphical input (mouse clicks), and draws the selected events with red dots.

Code Block

In [3]: scaninput = ScanInput()

In [4]: scaninput.fina = "/reg/d/psdm/XPP/xpp23410/hdf5/xpp23410-r0107.h5"

In [5]: scan = rdXPPdata(scaninput)
Reading XPP data from  /reg/d/psdm/XPP/xpp23410/hdf5/xpp23410-r0107.h5
Found pv control object  fs2:ramp_angsft_target
Found scan vector  [ 2800120.  2800240.  2800360.  2800480.  2800600.  2800720.  2800840.
  2800960.  2801080.  2801200.  2801320.  2801440.  2801560.  2801680.
  2801800.  2801920.  2802040.  2802160.  2802280.  2802400.  2802520.
  2802640.  2802760.  2802880.  2803000.  2803120.  2803240.  2803360.
  2803480.  2803600.  2803720.  2803840.  2803960.  2804080.  2804200.
  2804320.  2804440.  2804560.  2804680.  2804800.  2804920.  2805040.
  2805160.  2805280.]
Fetching data to correlate with motor
['IPM1', 'IPM2']
(44, 120, 4)

In [6]: channels = np.concatenate(scan.scandata,axis=0)

In [7]: channels.shape
Out[7]: (5280, 4)

In [8]: get_limits(channels,1,"correlation")
4 channels a 5280 events
indexes that pass filter:  (array([   1,    5,    8, ..., 5266, 5272, 5273]),)
Out[8]:
array([[ 0.00086654,  0.01604564],
       [ 0.67172102,  0.71968567],
       [ 0.00194716,  0.01447819],
       [ 0.80365403,  0.73463468]])

In [9]: plt.draw()
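
The step from a (44, 120, 4) array to a (5280, 4) array in the log above can be tried out without the data file. The following sketch uses a synthetic stand-in for scan.scandata with the same shape (44 scan steps, 120 shots per step, 4 channels); the values and the steps helper array are invented for illustration.

```python
import numpy as np

n_steps, n_shots, n_channels = 44, 120, 4

# Synthetic stand-in for scan.scandata: one (shots, channels) block per scan step
scandata = np.arange(n_steps * n_shots * n_channels, dtype=float)
scandata = scandata.reshape(n_steps, n_shots, n_channels)

# Concatenating along axis 0 stacks the per-step blocks into one long list of events
channels = np.concatenate(scandata, axis=0)

# Keep track of which scan step each event came from
steps = np.repeat(np.arange(n_steps), n_shots)

print(channels.shape)
```

The steps array makes it possible to get back to a single scan step later, e.g. channels[steps == 2].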

Table of comparison (MATLAB vs MatPlotLib)

See also http://www.scipy.org/NumPy_for_Matlab_Users

Loglog plot of one array vs. another (MatLab):

Code Block
%
%
%
a1 = subplot(121);
loglog(channels(:,1),channels(:,2),'o')
xlabel('CH0')
ylabel('CH1')
a2 = subplot(122);
loglog(channels(:,3),channels(:,4),'o')
xlabel('CH2')
ylabel('CH3')

Loglog plot of one array vs. another (MatPlotLib):

Code Block
import matplotlib.pyplot as plt
import numpy as np

a1 = plt.subplot(221)
plt.loglog(channels[:,0],channels[:,1], 'o' )
plt.xlabel('CH0')
plt.ylabel('CH1')
a2 = plt.subplot(222)
plt.loglog(channels[:,2],channels[:,3], 'o' )
plt.xlabel('CH2')
plt.ylabel('CH3')

channels is an Nx4 array of floats, where N is the number of events (5280 in the session above). Each column corresponds to one of the four IPIMB channels.

Note that the arrays are indexed with 1,2,3,4 in MatLab and 0,1,2,3 in MatPlotLib/NumPy/Python.

Note also the use of parentheses: array() in MatLab, array[] in MatPlotLib.

array of limits from graphical input (MatLab):

Code Block
axes(a1)
hold on
lims(1:2,:) = ginput(2);

axes(a2)
hold on
lims(3:4,:) = ginput(2);

array of limits from graphical input (MatPlotLib):

Code Block
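lims = np.zeros((4,2),dtype="float")

plt.sca(a1)                  # make a1 the current axes (older matplotlib: plt.axes(a1))
# plt.hold(True) is not needed in recent matplotlib, which always adds to the existing plot
lims[0:2,:] = plt.ginput(2)  # two mouse clicks give two (x, y) rows

plt.sca(a2)
lims[2:4,:] = plt.ginput(2)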

In MatLab, lims is an expandable array that holds limits as set by input from mouse click on the plot (ginput).
NumPy arrays cannot be expanded, so I've declared a 4x2 array of zeros to start with, then fill it with ginput().
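
If you don't know in advance how many points you will collect, an alternative to preallocating is to gather the points in a Python list and convert to an array at the end. The sketch below shows both approaches side by side. The click coordinates are made-up stand-ins for what plt.ginput(2) returns (a list of (x, y) tuples), so that the example runs without a plot window.

```python
import numpy as np

# Hypothetical (x, y) clicks, in the same form that plt.ginput(2) returns them
clicks_plot1 = [(0.0009, 0.67), (0.016, 0.72)]
clicks_plot2 = [(0.0019, 0.80), (0.014, 0.73)]

# Option 1: preallocate a fixed-size array and fill it in
lims = np.zeros((4, 2), dtype=float)
lims[0:2, :] = clicks_plot1
lims[2:4, :] = clicks_plot2

# Option 2: grow a Python list, convert to an array once at the end
collected = []
collected.extend(clicks_plot1)
collected.extend(clicks_plot2)
lims_from_list = np.array(collected, dtype=float)
```

Functions such as np.vstack or np.append also "expand" an array, but they do so by creating a new array each time, which is why the list approach is usually preferred when collecting many items.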

 

 

 

filter (MatLab):

Code Block

fbool1 = (channels(:,1)>min(lims(1:2,1)))&(channels(:,1)<max(lims(1:2,1)))
fbool2 = (channels(:,2)>min(lims(1:2,2)))&(channels(:,2)<max(lims(1:2,2)));
fbool = fbool1&fbool2
loglog(channels(fbool,1),channels(fbool,2),'or')

fbool3 = (channels(:,3)>min(lims(3:4,1)))&(channels(:,3)<max(lims(3:4,1)))
fbool4 = (channels(:,4)>min(lims(3:4,2)))&(channels(:,4)<max(lims(3:4,2)));
fbool = fbool3&fbool4
loglog(channels(fbool,3),channels(fbool,4),'or') 

filter (MatPlotLib):

Code Block

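# lims rows 0-1 are the two clicks on the first plot (x = CH0, y = CH1)
fbools0 = (channels[:,0]>lims[0:2,0].min())&(channels[:,0]<lims[0:2,0].max())
fbools1 = (channels[:,1]>lims[0:2,1].min())&(channels[:,1]<lims[0:2,1].max())
fbools = fbools0 & fbools1
a1.loglog(channels[fbools,0],channels[fbools,1],'or')

# lims rows 2-3 are the two clicks on the second plot (x = CH2, y = CH3)
fbools2 = (channels[:,2]>lims[2:4,0].min())&(channels[:,2]<lims[2:4,0].max())
fbools3 = (channels[:,3]>lims[2:4,1].min())&(channels[:,3]<lims[2:4,1].max())
fbools = fbools2&fbools3
a2.loglog(channels[fbools,2],channels[fbools,3],'or')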
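
The filtering step can be tried without a plot window or data file. The following sketch applies the same kind of boolean mask to random numbers, with hard-coded limits standing in for the mouse clicks. It assumes lims is laid out as a 4x2 array with rows 0-1 holding the (x, y) clicks on the first plot and rows 2-3 those on the second; the helper function inside() and all values are invented for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
channels = rng.uniform(0.0, 1.0, size=(1000, 4))   # stand-in for the Nx4 IPIMB array

lims = np.array([[0.2, 0.3],    # click 1 on the first plot  (x = CH0, y = CH1)
                 [0.6, 0.9],    # click 2 on the first plot
                 [0.1, 0.5],    # click 1 on the second plot (x = CH2, y = CH3)
                 [0.4, 0.8]])   # click 2 on the second plot

def inside(values, corner_values):
    """True for every value strictly between the smallest and largest corner value."""
    return (values > corner_values.min()) & (values < corner_values.max())

mask01 = inside(channels[:, 0], lims[0:2, 0]) & inside(channels[:, 1], lims[0:2, 1])
mask23 = inside(channels[:, 2], lims[2:4, 0]) & inside(channels[:, 3], lims[2:4, 1])

# Indexing with a boolean mask keeps only the rows (events) that passed the filter
selected01 = channels[mask01]
selected23 = channels[mask23]
```

Using min() and max() of the two clicks means it does not matter in which order the two corners were clicked.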

Writing Numpy and HDF5 files from python

You can store numpy arrays from a pyana job (reads XTC) and store them in simple numpy files or HDF5 files. Here are some examples:

Saving and loading:

Code Block

import numpy as np

np.save("filename.npy", array)
array = np.load("filename.npy")

np.savetxt("filename.dat", array)
array = np.loadtxt("filename.dat")

This example shows saving and loading of a binary numpy file (.npy) and an ASCII file (.dat).
Each file holds a single array, and the text format (np.savetxt) is limited to 1-D or 2-D arrays.
If you need to save multiple events/shots in the same file you will need some tricks (e.g. flatten each array and stack the 1-D arrays into a 2-D array where one axis represents the event number). Or you could save to an HDF5 file.
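
A minimal sketch of the flatten-and-stack trick is shown below. The frames are small synthetic arrays and the file goes to a temporary directory; in a real job they would come from your detector and go wherever you keep your output.

```python
import os
import tempfile
import numpy as np

frame_shape = (3, 4)
# Synthetic stand-ins for the images of five events
frames = [np.full(frame_shape, float(i)) for i in range(5)]

# Flatten each frame to 1-D and stack them: one row per event
stacked = np.vstack([frame.ravel() for frame in frames])

path = os.path.join(tempfile.mkdtemp(), "frames.npy")
np.save(path, stacked)

# After loading, restore the original frame shape; axis 0 is the event number
restored = np.load(path).reshape((-1,) + frame_shape)
```

You need to remember the frame shape yourself to undo the flattening, which is one reason the HDF5 route can be more convenient for larger jobs.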

Saving simple arrays to HDF5:

Code Block

import h5py

def beginjob(self,evt,env):
    self.ofile = h5py.File("outputfile.hdf5", 'w') # open for writing (overwrites existing file)
    self.shot_counter = 0

def event(self,evt,env):
    # example: store several arrays from one shot in a group labeled with shot (event) number
    self.shot_counter += 1
    group = self.ofile.create_group("Shot%d" % self.shot_counter)

    image1_source = "CxiSc1-0|TM6740-1"
    image2_source = "CxiSc1-0|TM6740-2"

    frame = evt.getFrameValue(image1_source)
    image1 = frame.data()
    frame = evt.getFrameValue(image2_source)
    image2 = frame.data()

    dataset1 = group.create_dataset("%s"%image1_source,data=image1)
    dataset2 = group.create_dataset("%s"%image2_source,data=image2)

def endjob(self,env):
    self.ofile.close()

This example is shown in a pyana setting. The HDF5 file is declared and opened in beginjob, datasets created for each event, and the file is closed in the endjob method.
Or you can group your datasets any other way you find useful, of course.
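
Reading such a file back in an interactive session could look roughly like this. The sketch first writes a small file with the same layout (one group per shot, one dataset per source) using synthetic images, so that it runs on its own; the temporary file path, image sizes and values are made up.

```python
import os
import tempfile
import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "outputfile.hdf5")
sources = ["CxiSc1-0|TM6740-1", "CxiSc1-0|TM6740-2"]

# Write three shots with synthetic 2x3 images, mimicking the pyana example above
with h5py.File(path, "w") as ofile:
    for shot in range(1, 4):
        group = ofile.create_group("Shot%d" % shot)
        for source in sources:
            image = np.full((2, 3), shot, dtype="uint16")
            group.create_dataset(source, data=image)

# Read everything back into a dictionary keyed by (shot name, source)
images = {}
with h5py.File(path, "r") as ifile:
    for shot_name in ifile:                # iterating a file or group yields member names
        for source in ifile[shot_name]:
            images[(shot_name, source)] = ifile[shot_name][source][...]
```

Don't rely on the order in which the shot groups come back when iterating; if the order matters, sort the names yourself or zero-pad the shot number in the group name.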

Saving complex datasets to HDF5 file

Some more advanced examples (courtesy of Hubertus Bromberger):

Code Block

##############
# Create data set
##############
import h5py
import numpy as np
from numpy.random import rand

f = h5py.File('test.hdf5', 'w')

f.create_dataset('t-nonames', data = rand(30000), dtype='<f4')

f.create_dataset('t-names', data = np.array(rand(30000), dtype=[('ps', '<f4')]))

dt = np.dtype([
    ('Charge', '<f4'), ('Energy', '<f4'), ('PosX', '<f4'),
    ('PosY', '<f4'), ('AngX', '<f4'), ('AngY', '<f4'),
    ('PkCurrBC2', '<f4')])
f.create_dataset('eBeam-names', data =
        np.array([tuple(i.tolist()) for i in rand(30000, 7)], dtype=dt))

f.create_dataset('eBeam-nonames', data = rand(30000, 7), dtype='<f4')

dt = np.dtype([('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4', (100,))])
f.create_dataset('dsSubset-names', data =
        np.array([tuple((i[0], i[1], i[2], i[3:].tolist())) for i in rand(30000,103)], dtype=dt))

f.create_dataset('dsSubset-nonames', data = rand(30000,103))

f.close()
Code Block


##############
# Load data and benchmark data access
##############
import time

f = h5py.File('test.hdf5', 'r')

iterations = int(1e4)

#######
# Single col
#######
start = time.time()
for i in xrange(iterations):
    a = f['t-names']['ps']/f['t-names']['ps'].max()
print "Single column as compound dataset: %.2fs" % (time.time() - start)

start = time.time()
for i in xrange(iterations):
    a = f['t-nonames'][:]/f['t-nonames'][:].max()
print "Single column as dataset: %.2fs" % (time.time() - start)

start = time.time()
a = f['t-names']['ps']
for i in xrange(iterations):
    b = a/a.max()
print "Single column from compound dataset prior assignment: %.2fs" % (time.time() - start)

start = time.time()
a = f['t-nonames'][:]
for i in xrange(iterations):
    b = a/a.max()
print "Single column dataset and prior assignment: %.2fs\n" % (time.time() - start)

Code Block


#######
# Select single col from 2x2
#######
start = time.time()
for i in xrange(iterations):
    a = f['eBeam-names']['Energy']/f['eBeam-names']['Energy'].max()
print "Single column as compound dataset: %.2fs" % (time.time() - start)

start = time.time()
for i in xrange(iterations):
    a = f['eBeam-nonames'][:,1]/f['eBeam-nonames'][:,1].max()
print "Single column as dataset: %.2fs" % (time.time() - start)

start = time.time()
a = f['eBeam-names']['Energy']
for i in xrange(iterations):
    b = a/a.max()
print "Single column from compound dataset prior assignment: %.2fs" % (time.time() - start)

start = time.time()
a = f['eBeam-nonames'][:,1]
for i in xrange(iterations):
    b = a/a.max()
print "Single column dataset and prior assignment: %.2fs\n" % (time.time() - start)

#######
# Select columns from 2x2
#######
start = time.time()
for i in xrange(iterations/50):
    for row in f['dsSubset-names']['d']:
        a = row/row.max()
print "Columns as compound dataset: %.2fs" % (time.time() - start)

start = time.time()
for i in xrange(iterations/50):
    for row in f['dsSubset-nonames'][:,3:103]:
        a = row/row.max()
print "Columns as dataset '[:,3:103]': %.2fs" % (time.time() - start)
start = time.time()
for i in xrange(iterations/50):
    for row in f['dsSubset-nonames'][:,3:]:
        a = row/row.max()
print "Columns as dataset '[:,3:]': %.2fs" % (time.time() - start)


start = time.time()
a = f['dsSubset-names']['d']
for i in xrange(iterations/50):
    for row in a:
        b = row/row.max()
print "Columns as compound dataset and prior assignment: %.2fs" % (time.time() - start)

start = time.time()
a = f['dsSubset-nonames'][:,3:]
for i in xrange(iterations/50):
    for row in a:
        b = row/row.max()
print "Columns as dataset and prior assignment: %.2fs" % (time.time() - start)

f.close()
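
For readers who have not met compound datasets before: on the numpy side they correspond to structured arrays, where columns are accessed by name instead of by index. The small sketch below shows the two layouts next to each other, using only numpy and a few of the field names from the example above; the values are synthetic.

```python
import numpy as np

# Plain layout: 4 events, 3 unnamed columns
plain = np.arange(12, dtype="<f4").reshape(4, 3)

# Compound layout: the same numbers, but every column has a name
dt = np.dtype([("Charge", "<f4"), ("Energy", "<f4"), ("PosX", "<f4")])
compound = np.zeros(len(plain), dtype=dt)
for column, name in enumerate(dt.names):
    compound[name] = plain[:, column]

# The same column, selected by name and by index
energy_by_name = compound["Energy"]
energy_by_index = plain[:, 1]

# The operation that the benchmarks above repeat many times
normalized = energy_by_name / energy_by_name.max()
```

Named fields make the analysis code easier to read; whether they cost anything in access time is what the benchmark above is meant to find out.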

Data visualization with NumPy (arrays) and MatPlotLib (plots).

This is not meant to be documentation or a tutorial for matplotlib or numpy. Just a place to document stuff that I have a hard time finding explained elsewhere.

Inspecting objects:

Code Block

# list every attribute of obj together with its value
for attr_name in dir(obj):
    attribute = getattr(obj, attr_name)
    print("%s: %s" % (attr_name, attribute))
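
The loop above prints everything, including the many double-underscore attributes that every Python object has. A slightly more selective variant is sketched below; the function name describe and the Example class are made up for illustration.

```python
def describe(obj, include_private=False):
    """Return a {name: value} dictionary of the attributes of obj."""
    result = {}
    for attr_name in dir(obj):
        # Names starting with an underscore are internal by convention
        if attr_name.startswith("_") and not include_private:
            continue
        result[attr_name] = getattr(obj, attr_name)
    return result


class Example(object):
    """Small stand-in for an object you might want to inspect."""
    def __init__(self):
        self.energy = 1.5
        self.label = "shot"


info = describe(Example())
for name in sorted(info):
    print("%s: %r" % (name, info[name]))
```

In IPython, typing the object name followed by a question mark gives similar information interactively.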
