This documentation was started by Mark Arndt, who has since left SLAC, so it is incomplete. See the whiteboard notes. See also Sioan's Google Slides (in particular the slide titled "The XTC2 data structure", which includes hyperlinks to the GitHub code).

Concepts and Use Cases

  • self-describing data format
    • description includes the algorithms and version numbers (e.g. "gzip"), where necessary, that allow the correct software to be instantiated to analyze a piece of data
  • no dependencies
  • lightweight (2500 lines of C++)
  • same data format used in-mem and on-disk
  • no serialization (copying) step when sending xtc over a network
  • can be read while data is being written
  • LCLS-II will write one or more xtc files per detector
  • each detector is identified by det name, det type, det serial number, and det "segment" (which piece of a detector we are)
  • det configuration metadata shows up at the beginning of every xtc file
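
The properties listed above (self-describing records, one layout in memory and on disk, readable while still being written) can be illustrated with a minimal Python sketch. The header layout used here (type id, version, payload size) and the names HEADER, pack_record and iter_records are assumptions made up for illustration; this is not the real Xtc layout or API.

```python
import struct

# Hypothetical header, NOT the real Xtc layout:
# type id (uint16), version (uint16), payload size in bytes (uint32).
HEADER = struct.Struct("<HHI")


def pack_record(type_id, version, payload):
    """Prefix a payload with a header that describes it."""
    return HEADER.pack(type_id, version, len(payload)) + payload


def iter_records(buf):
    """Walk the records in a buffer in place, without copying payloads.

    The same bytes could come from memory, a socket, or a file, because
    the layout is identical in all three cases.
    """
    view = memoryview(buf)
    pos = 0
    while pos + HEADER.size <= len(view):
        type_id, version, size = HEADER.unpack_from(view, pos)
        start = pos + HEADER.size
        end = start + size
        if end > len(view):
            # Incomplete record: the writer may still be appending to it.
            break
        yield type_id, version, view[start:end]
        pos = end


buf = pack_record(1, 0, b"abc") + pack_record(2, 3, b"hello")
records = [(t, v, bytes(p)) for t, v, p in iter_records(buf)]
```

Because each header carries the type, version and size of what follows, a reader can skip records it does not understand and can stop cleanly at a partially written record.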

Xtc Big Data Files

TBW.

Xtc Per-File Structure

(Note: this is a user-centric description of the data format, focusing on the most commonly used parts of the API. It doesn't cover all of the structure and metadata present in the Xtc format. For example, Xtc data records do not begin with a Names block. For more detail, see the Xtc Library Reference.)

For typical user code written to parse Xtc data, each record/file effectively begins with a set of data Names,

Xtc Small Data Files

Trying to work up a name for what we labeled "small data" on the whiteboard. "Small data" didn't seem like the right description.

These files (generated automatically by the DAQ) have the same format as big-data files but include, at a minimum, the on-disk "offsets" of the associated big data. There is one small-data "event" per big-data "event", which is used for parallelization. The files can also include other small data (e.g. diode values) that can be used to "filter" events, avoiding the penalty of fetching the large data. The current tool is "smdwriter" (may change). If you run "pytest psana/psana/tests", the small data files will be in .tmp/smalldata/*.smd.xtc.
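
The offset-plus-filter idea can be sketched roughly as follows. The entry layout (diode value, big-data offset, big-data size) and the names SMD_ENTRY, write_event_pair and read_filtered are hypothetical and chosen only for illustration; the real small-data files use the Xtc format and are produced by smdwriter, not by code like this.

```python
import io
import struct

# Hypothetical small-data entry, NOT the real smd.xtc layout:
# diode value (float64), big-data offset (uint64), big-data size (uint32).
SMD_ENTRY = struct.Struct("<dQI")


def write_event_pair(big, smd, diode, payload):
    """Append one big-data event and its matching small-data entry."""
    offset = big.tell()
    big.write(payload)
    smd.write(SMD_ENTRY.pack(diode, offset, len(payload)))


def read_filtered(big, smd_bytes, threshold):
    """Fetch big data only for events whose diode value passes the cut."""
    for diode, offset, size in SMD_ENTRY.iter_unpack(smd_bytes):
        if diode < threshold:
            continue  # rejected using small data alone; no big-data read
        big.seek(offset)
        yield diode, big.read(size)


big, smd = io.BytesIO(), io.BytesIO()  # in-memory stand-ins for the two files
write_event_pair(big, smd, 0.1, b"aaaa")
write_event_pair(big, smd, 0.9, b"bb")
write_event_pair(big, smd, 0.7, b"cccccc")

selected = list(read_filtered(big, smd.getvalue(), threshold=0.5))
```

Since every small-data entry records where its big-data event lives, the entries could also be divided among workers, each of which seeks directly to its own events.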

Redesigning Xtc for LCLS-II

More self-describing:  arrays of different types (floats, ints) and values (floats, ints)

 

 

 