Data Summary Tool
A lightweight, flexible analysis micro-framework suitable both for production "standardized" data summary analysis and for rapidly prototyping specialized research analyses. In data summary mode, the program performs a "standard" analysis on every piece of data encountered in the data files and returns a summarizing result. For specialized research, the analysis and results can be more arbitrary.
The long-term goal is to also run online against shared memory, but that mode is not yet supported.
Usage
The data summary tool is in SVN, and there is a version checked out in the directory below, which can be used by executing the following commands.
% cd /reg/neh/home/justing/my_ana_rel/DataSummary
% sit_setup
The output is placed in the user's home directory, at ~/data-summary/.
The code can be checked out and modified with the commands at the end of this page.
The data summary tool can be used in the following three ways:
- locally on a psana node in a single core mode,
- locally on a psana node in a multi-core mode with mpirun,
- or in a batch multi-core mode using the bsub command.
1] % python data-summary-tool.py CXI/cxic0114 111
2] % mpirun -n 6 python data-summary-tool.py CXI/cxic0114 111
3] % bsub -a mympi -n 24 -o mpi.log -q psanaq python data-summary-tool.py CXI/cxic0114 111
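The three invocations differ only in the launcher prefix placed in front of the same tool command. A minimal sketch of that relationship follows; the function `build_command` and its `mode` names are hypothetical helpers for illustration, not part of the tool, and only the flags shown in the commands above are used.

```python
def build_command(mode, exp, run, cores=None):
    """Return the argument list for one of the three launch modes.

    `mode` is a hypothetical label: "single", "mpi", or "batch".
    The default core counts (6 and 24) are taken from the example commands.
    """
    tool = ["python", "data-summary-tool.py", exp, str(run)]
    if mode == "single":
        # Single core on a psana node: no launcher prefix.
        return tool
    if mode == "mpi":
        # Multi-core on a psana node via mpirun.
        return ["mpirun", "-n", str(cores or 6)] + tool
    if mode == "batch":
        # Batch multi-core submission via bsub, as in the third example.
        return ["bsub", "-a", "mympi", "-n", str(cores or 24),
                "-o", "mpi.log", "-q", "psanaq"] + tool
    raise ValueError("unknown mode: %r" % (mode,))


if __name__ == "__main__":
    for mode in ("single", "mpi", "batch"):
        print(" ".join(build_command(mode, "CXI/cxic0114", 111)))
```

The returned list can be passed to `subprocess.call` unchanged, which avoids shell quoting issues.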
Additional options can be passed to the data-summary-tool.py script:
usage: data-summary-tool.py [-h] [--max-events-per-node MAX_EVENTS]
                            [--plot-vs X_AXES] [--verbose] [--xkcd]
                            [--base-output-dir BASEOUTPUTDIR]
                            exp run

positional arguments:
  exp                   the experiment, e.g. CXI/cxic0114
  run                   run to process, e.g. 111

optional arguments:
  -h, --help            show this help message and exit
  --max-events-per-node MAX_EVENTS, -M MAX_EVENTS
                        maximum events to process per node
  --plot-vs X_AXES, -X X_AXES
                        pass in channels to plot against, can be passed
                        multiple times
  --verbose, -v         verbosity level of logging, default is 4 (INFO),
                        choices are 1-5 (CRITICAL, ERROR, WARNING, INFO,
                        DEBUG), can pass -v multiple times
  --xkcd, -x            use XKCD plot style
  --base-output-dir BASEOUTPUTDIR, -O BASEOUTPUTDIR
                        set output folder for reports
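For readers who want to script against this interface, the following is a sketch of an argparse parser that would accept the same arguments. It is reconstructed from the help text only and is not the tool's actual source. The destination names (`max_events`, `x_axes`, `baseoutputdir`) are inferred from the metavars, and treating `-v` as a counter that starts at 4 and parsing `run` as an integer are both assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line interface."""
    parser = argparse.ArgumentParser(prog="data-summary-tool.py")
    parser.add_argument("exp", help="the experiment, e.g. CXI/cxic0114")
    # Assumption: the run is parsed as an integer.
    parser.add_argument("run", type=int, help="run to process, e.g. 111")
    parser.add_argument("--max-events-per-node", "-M", dest="max_events",
                        type=int, default=None,
                        help="maximum events to process per node")
    # action="append" lets -X be given several times.
    parser.add_argument("--plot-vs", "-X", dest="x_axes", action="append",
                        default=[],
                        help="channels to plot against, may be repeated")
    # Assumption: each -v raises the level by one from the default of 4.
    parser.add_argument("--verbose", "-v", action="count", default=4,
                        help="verbosity level of logging, 1-5")
    parser.add_argument("--xkcd", "-x", action="store_true",
                        help="use XKCD plot style")
    parser.add_argument("--base-output-dir", "-O", dest="baseoutputdir",
                        default=None, help="set output folder for reports")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["CXI/cxic0114", "111", "-M", "400"])
    print(args.exp, args.run, args.max_events)
```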
By default, output is placed in the running user's $HOME/data-summary/ directory, which is created if it does not exist. The location can be changed with the -O (--base-output-dir) option.
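The default-directory behavior described here can be sketched as follows. The function name `resolve_output_dir` is hypothetical, and the tool's own implementation may differ; the sketch only illustrates "use the override if given, otherwise fall back to ~/data-summary, and create the directory if it is missing".

```python
import os


def resolve_output_dir(base_output_dir=None):
    """Return the report directory, creating it if it does not exist.

    `base_output_dir` stands in for the value passed with -O; when it is
    None, the documented default of ~/data-summary is used.
    """
    if base_output_dir is None:
        base_output_dir = os.path.join(os.path.expanduser("~"), "data-summary")
    if not os.path.isdir(base_output_dir):
        os.makedirs(base_output_dir)
    return base_output_dir
```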
Get the code
The code is developed on GitHub, and the repository can be viewed, forked, and commented on at https://github.com/jgarofoli/LCLS-data-summary.
The code is also checked in to SVN. It can be retrieved by following these instructions:
$> ssh psdev
$> newrel ana-current datasummarytest
$> cd datasummarytest
$> addpkg DataSummary
$> scons
$> ssh psana
$> cd datasummarytest/arch/x86_64-rhel5-gcc41-opt/python/DataSummary
$> python data-summary-tool.py CXI/cxic0114 111 -M 400
$> gnome-open ~/data-summary/cxic0114_run111.latest/report.html
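The last command above opens the report for experiment CXI/cxic0114, run 111. A small sketch for locating that file from a script is given below. The naming pattern (`<experiment>_run<run>.latest/report.html`) is inferred from this single example and may not hold in general, and `latest_report_path` is a hypothetical helper, not part of the tool.

```python
import os


def latest_report_path(exp, run, base_output_dir=None):
    """Guess the path of the most recent report for an experiment and run.

    Assumes the directory layout seen in the example:
    <base>/<experiment>_run<run>.latest/report.html
    """
    if base_output_dir is None:
        base_output_dir = os.path.join(os.path.expanduser("~"), "data-summary")
    # "CXI/cxic0114" -> "cxic0114": drop the instrument prefix.
    experiment = exp.split("/")[-1]
    run_dir = "%s_run%d.latest" % (experiment, int(run))
    return os.path.join(base_output_dir, run_dir, "report.html")


if __name__ == "__main__":
    print(latest_report_path("CXI/cxic0114", 111))
```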
Output
Output is rendered as an HTML file containing images and text, together with a Python dictionary holding data and image locations, suitable for programmatic consumption.
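The layout of that dictionary is not documented here, so the following is only a sketch of one way to consume it: a generic walk over a nested dictionary that collects every string value that looks like an image path. The example keys and the list of file extensions are assumptions made for illustration.

```python
def find_image_paths(node, extensions=(".png", ".jpg", ".jpeg", ".gif", ".svg")):
    """Recursively collect string values that end in an image extension."""
    found = []
    if isinstance(node, dict):
        for value in node.values():
            found.extend(find_image_paths(value, extensions))
    elif isinstance(node, (list, tuple)):
        for item in node:
            found.extend(find_image_paths(item, extensions))
    elif isinstance(node, str) and node.lower().endswith(extensions):
        found.append(node)
    return found


if __name__ == "__main__":
    # Hypothetical structure; the real dictionary's keys may differ.
    summary = {
        "detector": {"mean": 1.5, "figure": "figs/detector_mean.png"},
        "plots": ["figs/trend.PNG", "notes.txt"],
    }
    print(sorted(find_image_paths(summary)))
```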