Draft, in work. This documentation was started by Mark Arndt, who has since left SLAC, so it is incomplete. See the whiteboard notes. See also Sioan's Google Slides (in particular the slide titled "The XTC2 data structure", which includes hyperlinks to GitHub code).


Topics

TBW.

Concepts and Use Cases

TBW.

Provide support for distributed parallel processing by coupling each large raw data file with a small metadata "manifest" file containing only offset pointers and filter criteria that allow a parallel processing node to <make XYZ decisions about how to carve up and process the data>

Concepts

TBW.

self-describing data format
- the description includes the algorithms and version numbers (e.g. "gzip") where necessary, so that the correct software can be instantiated to analyze a piece of data
no dependencies
lightweight (2500 lines of C++)
same data format used in-mem and on-disk
no serialization (copying) step when sending xtc over a network
can be read while data is being written
LCLS-II will write one or more xtc files per detector
detector name, detector type, detector serial number, detector "segment" (which piece of a detector this is)
detector configuration metadata appears at the beginning of every xtc file
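
To make the "self-describing" and "no serialization" points above concrete, here is a minimal sketch. It assumes a made-up header (type id, version, extent) that is NOT the real Xtc header layout; the names `HEADER`, `pack_container`, and `walk` are hypothetical and exist only for illustration. It shows how a reader can step through a buffer using only the sizes stored in the data itself, without copying payloads, and how it can stop cleanly at a container that is still being written.

```python
import struct

# Hypothetical header for illustration only (not the real Xtc header):
# type id (uint16), version (uint16), extent in bytes including the header (uint32).
HEADER = struct.Struct("<HHI")


def pack_container(type_id, version, payload):
    """Prefix a payload with a header that describes it."""
    return HEADER.pack(type_id, version, HEADER.size + len(payload)) + payload


def walk(buffer):
    """Yield (type_id, version, payload_view) per container, without copying."""
    view = memoryview(buffer)
    offset = 0
    while offset + HEADER.size <= len(view):
        type_id, version, extent = HEADER.unpack_from(view, offset)
        if extent < HEADER.size or offset + extent > len(view):
            break  # incomplete container: the writer may still be appending
        yield type_id, version, view[offset + HEADER.size:offset + extent]
        offset += extent


# The same bytes could sit in memory, on disk, or on the wire.
buf = pack_container(1, 2, b"config") + pack_container(
    7, 1, struct.pack("<3f", 1.0, 2.0, 3.0)
)
records = [(t, v, bytes(p)) for t, v, p in walk(buf)]
```

Because each container carries its own type, version, and extent, a reader can skip payloads it does not understand and pick the right decoder for the ones it does.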

Xtc Big

Xtc Detector

Data Files

TBW.

[Diagram: Xtc_per_file_structure]

(Note: this is a user-centric description of the data format, focusing on the most typically used parts of the API; it doesn't cover all of the structure and metadata present in the Xtc format. For example, Xtc data records do not begin with a Names block. For more detail, see the Xtc Library Reference.)

For typical user code written to parse Xtc data, each record/file effectively begins with a set of data Names,

Xtc

Manifest

Small Data Files

Trying to work up a name for what we labeled "small data" on the whiteboard. "Small data" didn't seem like the right description.

These files (generated automatically by the DAQ) have the same format as big-data files but include, at a minimum, the on-disk "offsets" of the associated big data, with one small-data "event" per big-data "event". They are used for parallelization. They can also include other small data (e.g. diode values) that can be used to "filter" events, avoiding the penalty of fetching the large data. The current tool is "smdwriter" (may change). If you run "pytest psana/psana/tests", the small data files will be in .tmp/smalldata/*.smd.xtc.
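
The idea of using offsets for parallelization and small values for filtering can be sketched as follows. This is an illustration under stated assumptions, not the psana or smdwriter implementation: the record layout `SMD_ENTRY` (offset, size, diode value), the helper names, and the round-robin assignment of events to workers are all hypothetical.

```python
import io
import struct

# Hypothetical small-data record for illustration only:
# big-data offset (uint64), big-data size (uint32), diode value (float32).
SMD_ENTRY = struct.Struct("<QIf")


def build_files(events):
    """events: list of (diode_value, big_payload). Returns (big_bytes, smd_bytes)."""
    big, smd = io.BytesIO(), io.BytesIO()
    for diode, payload in events:
        # One small-data entry per big-data event, recording where it lives.
        smd.write(SMD_ENTRY.pack(big.tell(), len(payload), diode))
        big.write(payload)
    return big.getvalue(), smd.getvalue()


def read_selected(big_file, smd_bytes, rank, n_ranks, min_diode):
    """Read only the big-data events owned by this rank that pass the filter."""
    selected = []
    for i, (offset, size, diode) in enumerate(SMD_ENTRY.iter_unpack(smd_bytes)):
        if i % n_ranks != rank:
            continue  # another worker owns this event (assumed round-robin)
        if diode < min_diode:
            continue  # rejected using small data only; no big-data read
        big_file.seek(offset)
        selected.append(big_file.read(size))
    return selected


events = [(0.1, b"A" * 10), (0.9, b"B" * 20), (0.8, b"C" * 30), (0.2, b"D" * 40)]
big_bytes, smd_bytes = build_files(events)
rank0 = read_selected(io.BytesIO(big_bytes), smd_bytes, rank=0, n_ranks=2, min_diode=0.5)
rank1 = read_selected(io.BytesIO(big_bytes), smd_bytes, rank=1, n_ranks=2, min_diode=0.5)
```

Each worker scans only the small file, so the cost of touching the large file is paid only for events that the worker owns and that pass the filter.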

Redesigning Xtc for LCLS-II

LCLS-I lessons learned:
More self-describing: arrays of different types (floats, ints) and values (floats, ints)
