Objective

Currently LCLS does not offer a uniform approach to the analysis of data accumulated in experiments. Users employ myana, pyana, MATLAB, IDL, CASS, and probably other tools. Work on the long-awaited psana project is in progress, but it is going to be quite generic and probably not simple for new users to approach. On this page we discuss a simple but flexible approach to the analysis of data stored in HDF5 files. It is based on Python code that makes extensive use of standard libraries. A few examples of how to access and process data are presented at the end of this page.

This approach has obvious advantages:

  • It is flexible: an HDF5 file has an indexed structure, which gives your code direct access to any event data in any file.
  • Python is a high-level scripting language that lets you write transparent and compact code based on mature standard libraries.
  • In general, Python code runs slowly compared to C++, but libraries such as NumPy, whose core is implemented in C, solve this problem for the manipulation of large arrays.
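
The last point can be illustrated with a minimal sketch. The function names, the threshold, and the synthetic "image" below are made up for this illustration and are not part of any LCLS tool. Both functions compute the mean of all pixels above a threshold: the first uses explicit Python loops, the second delegates the work to NumPy array operations.

```python
import numpy as np


def mean_above_threshold_loop(image, threshold):
    """Pure-Python version: visit every pixel in nested loops."""
    total = 0.0
    count = 0
    for row in image:
        for value in row:
            if value > threshold:
                total += value
                count += 1
    return total / count if count else 0.0


def mean_above_threshold_numpy(image, threshold):
    """NumPy version: the pixel selection and the mean run in compiled code."""
    selected = image[image > threshold]  # boolean-mask indexing
    return float(selected.mean()) if selected.size else 0.0


# Synthetic stand-in for a detector image (assumption: 200 x 300 pixels).
image = (np.arange(200 * 300, dtype=float) % 97).reshape(200, 300)

loop_result = mean_above_threshold_loop(image, 50.0)
numpy_result = mean_above_threshold_numpy(image, 50.0)
print(loop_result, numpy_result)
```

Both calls should return the same value up to floating-point rounding. On large arrays the NumPy version is typically much faster, because the per-pixel work is not executed by the Python interpreter.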

This approach also has a couple of drawbacks:

  • you have to know or learn Python
  • the current version of the h5py library is quite slow with long HDF5 files

The first issue is not really a drawback. The basic concepts of this high-level language can be learned from scratch in a couple of days. Within a week you will feel like an expert and will enjoy programming in this powerful language. The second issue, the slow h5py library, is really annoying, but we hope that its authors will take our comments into account and that its performance will improve soon.

Below we assume that everything is set up to work on the LCLS analysis farm; otherwise see Computing and Account Setup.

Libraries

Here is a list of libraries with appropriate references which we are going to use in our examples:

These libraries can be imported near the top of the Python file, for example:

Code Block
#!/usr/bin/env python
import h5py
import numpy as np
import scipy as sp
import scipy.ndimage as spi
import matplotlib.pyplot as plt

Basic operations

Let us consider the basic operations that you have to code in order to access HDF5 data.

  • Open file, get dataset, get array for current event, and close file:

...
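
As a hedged illustration of the four steps named above (open file, get dataset, get array for one event, close file), here is a minimal, self-contained sketch. It is not the original example from this page. The file name, the dataset path `/run_0/detector/image`, and the array shape are hypothetical and chosen only so that the code runs anywhere. Real LCLS files have their own group and dataset layout, which you need to look up for your experiment.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical names and sizes, used only for this illustration.
DSNAME = "/run_0/detector/image"
N_EVENTS, ROWS, COLS = 5, 4, 6


def make_demo_file(path):
    """Write a tiny stand-in file holding one (ROWS, COLS) image per event."""
    data = np.arange(N_EVENTS * ROWS * COLS, dtype=np.uint16)
    data = data.reshape(N_EVENTS, ROWS, COLS)
    with h5py.File(path, "w") as f:
        f.create_dataset(DSNAME, data=data)


def read_event(path, dsname, event):
    """Open the file, get the dataset, read one event, close the file."""
    f = h5py.File(path, "r")      # open file read-only
    try:
        ds = f[dsname]            # get dataset; no data is read yet
        arr = ds[event]           # read only this event into a NumPy array
    finally:
        f.close()                 # close file even if reading failed
    return arr


fname = os.path.join(tempfile.mkdtemp(), "demo.h5")
make_demo_file(fname)

image = read_event(fname, DSNAME, 2)
print(image.shape, image.dtype)
```

Indexing the dataset with an event number reads only that record from disk, which is the direct access to event data mentioned in the Objective section. The returned object is an ordinary NumPy array and remains valid after the file is closed.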