Initial Discussion

want:
- give the drp a psana-python script (a rough sketch of such a script is below)
- drive that psana-python script by calling psana_set_dgram(Dgram*) (this would replace the file reading)
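
A rough sketch (assumptions only, not a real API: the drp-backed DataSource doesn't exist yet, and the drp=True argument, detector name and ROI here are placeholders) of the kind of psana-python script a user might hand to the drp:

    from psana import DataSource

    ds = DataSource(drp=True)            # assumption: a hypothetical drp-fed DataSource
    for run in ds.runs():
        det = run.Detector('epixhr')     # placeholder detector name
        for evt in run.events():         # each event would arrive via psana_set_dgram()
            img = det.calib(evt)         # det.calib/det.image as discussed below
            roi = img[0:100, 0:100]      # user-defined ROI / fex computation
            # ... hand the fex result back to the drp instead of the raw data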

...

A potential issue (lower priority): this method of running psana (and shmem) does not have a scalable way of loading the calibration constants: each core will access the database.  Ideally we would fix this.

  • needed for epixhr
    • we should make psana (det.calib and det.image) usable in drp
  • happens in the worker cores (right after dma engine, before time-ordered messages for teb/meb):
    • worry about GIL with threads
  • may need easier tools to append to xtc from python (extracted features)
  • configured from configdb?
  • can consider compiling python (e.g. with cython) for speed improvement
    • can gain 20%-30%? (potentially small improvement compared to GIL)
    • cpo: treat as an optimization
  • really depends on Mikhail's area-detector interface being able to handle a variable number of segments
  • problem: currently we can't make a DataSource
    • Mona can probably implement one
    • problem: How do we scale to large numbers of cores fetching calibration constants? (this is a problem for shmem too)

Thoughts on a DRP DataSource

  • dgrams flow through in memory, one at a time (a little like shmem)
  • the dgrams are in circular buffers (a little like shmem) ("pebble": per-event-buffers-with-boundaries-listed-explicitly)
  • like shared memory, we don't actually free the dgram memory when we're done
  • shmem doesn't work as-is, because shmem drops events if the code can't keep up
  • need a new pattern similar to shmem, but one that applies back-pressure instead of dropping events (sketched below)
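
A minimal sketch of the pebble idea (illustrative only; the class and sizes are made up, not the real drp code):

    import queue

    class Pebble:
        """per-event-buffers-with-boundaries-listed-explicitly: a fixed pool of
        pre-allocated buffers that get recycled rather than freed, and that
        block (back-pressure) instead of dropping events when python falls behind"""
        def __init__(self, nbuffers=16, bufsize=4*1024*1024):
            self._free = queue.Queue()
            for _ in range(nbuffers):
                self._free.put(bytearray(bufsize))   # allocated once, reused forever

        def acquire(self):
            # blocks when all buffers are in flight: unlike shmem, nothing is dropped
            return self._free.get()

        def release(self, buf):
            # "done" just returns the buffer to the pool; the memory is never freed
            self._free.put(buf)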

Usage of Python

  • optional (e.g. if we run 'drp -p "myroi.py <arg1> <arg2>"'): python runs in each worker thread (one subprocess per worker):
    • send the detector raw-data datagram to the psana-python subprocess via a zmq bi-directional pipe (perhaps); see the sketch after this list.  Performance matters
    • receive potentially two datagrams back:
      • datagram with fex data
      • datagram with trigger data (e.g. number of photons) to send to the teb for both the trigger/monitoring decision
    • to save memory the received fex datagram would replace the raw datagram in pebble memory (requires a memory copy)
  • each teb process would have optional python to process the trigger-datagram produced by the worker (the python may not be psana, depending on whether or not we send all the phase 2 transitions to the teb? ... don't remember)
  • we would start with fex and do the trigger later
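
A hedged sketch of the worker/python-subprocess exchange (the zmq PAIR transport, endpoint name, framing and process_raw() are all assumptions, not the real drp protocol; the real messages would be xtc dgrams):

    import zmq

    def fex_subprocess(endpoint="ipc:///tmp/drp_worker_0"):    # hypothetical endpoint; the C++ worker binds the other end
        sock = zmq.Context.instance().socket(zmq.PAIR)
        sock.connect(endpoint)
        while True:
            raw_dgram = sock.recv()                            # raw-data dgram from the worker
            fex_dgram, trig_dgram = process_raw(raw_dgram)     # user fex + trigger data
            sock.send_multipart([fex_dgram, trig_dgram])       # the two dgrams go back to the worker

    def process_raw(raw_dgram):
        # placeholder: this is where psana det.calib / ROI / photon finding would run
        # and where the fex and trigger dgrams would be built (e.g. with dgramCreate)
        return raw_dgram[:128], b"nphotons=0"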

Possible approach in time-order:

  • standalone C++ with two threads, each thread having a python subprocess communicating with a bidirectional pipe/queue/shmem (whatever is most efficient).  send "hello world" back and forth.
  • embed the above in a drp executable
  • start passing dgrams instead of "hello world"
  • switch to psana-python

An early-ish goal is the TXI 2M epixHR.  ROI doesn't need corrections, but photon finding does.  Can we do photon-finding @25kHz?  25kHz distributed over 4 nodes (240 cores total) is roughly 100Hz per core, but currently a 2M epix runs photon finding on one core at 5Hz (what Valerio sees with peakfinder8 in MFX?).  So there is a factor of ~20 discrepancy.
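
Spelling out the arithmetic (the 60 cores/node is just 240 cores spread over 4 nodes):

    target_rate_hz = 25_000
    cores = 4 * 60                                           # 240 cores total
    per_core_hz = target_rate_hz / cores                     # ~104 Hz per core required
    current_per_core_hz = 5                                  # ~5 Hz today (peakfinder8, one core)
    print(per_core_hz, per_core_hz / current_per_core_hz)    # ~104 Hz, ~21x shortfall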

Second Discussion

Oct 14, 2021 with Mikhail, Mona, Valerio

Executive Summary: now prefer option 1 below, which feels like the simplest, although not the highest performance.

...

- valerio: drp communicate with multiple python processes
- mona:
  o routines to modify dgrams from python
    - look at dgramCreate.pyx
      https://github.com/slac-lcls/lcls2/blob/master/psana/psana/peakFinder/dgramCreate.pyx
      also the test test_py2xtc.py
      goal: receive the raw dgram and return a fex dgram with the raw data
      (almost always) removed and replaced by fex data (how do we do this
      return?).  see the sketch below.
  o think about calib-fetch scaling (lower-priority)
    - ideally independent of mpi. zmq approach? how do we
      determine "supervisor"?

Json2XTC Option

drp dgram -> psana creates json -> drp calls Json2Xtc (a sketch of the json step is below)

- need to make sure json2xtc can separate the Names (configure-only)
  from Shapes/Data (l1accept)
- cpo worries about performance of the extra json step
- cpo votes to try the dgramCreate approach (with 75% confidence level)
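
Illustrative sketch of the json the psana side might hand back for Json2Xtc (the schema is made up just to show the Names vs Shapes/Data split; the real Json2Xtc input format would need to be checked):

    import json

    names_json = json.dumps({            # sent once, at configure (the Names)
        "detName": "epixhr", "alg": "fex",
        "fields": {"roi": {"type": "uint16", "rank": 2},
                   "nphotons": {"type": "int32", "rank": 0}},
    })

    event_json = json.dumps({            # sent per l1accept (the Shapes/Data)
        "roi": {"shape": [100, 100], "data": "<payload would go here or in a side buffer>"},
        "nphotons": 7,
    })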

Where Python Runs

  1. drp fex (mona/valerio/ric psana python)
  2. producing the trigger data (on the drp) (Ric, custom simple data format?)
    1. should this be part of the drp fex or a separate subprocess/call?
    2. we can do this with either a python "call" or a subprocess, but "call" can be dangerous because the multi-threaded DRP can get stuck on the GIL: don't do the "call".
  3. analyzing the trigger data (on the teb) (Ric, custom simple data format?)
  4. python in EbReceiver that uses trigger data results to modify the dgram (e.g. ROI) (mona/valerio/ric psana python) 
    1. use the prescale to record the raw/fex? (but isn't this done by drp fex?)

(1) and (2) could be put into the same psana-python process. (4) is a separate python process.

Some Technical Details

  • shmem ownership/cleanup
    • valerio uses sysv ipc instead of posix_ipc (because the conda version of posix_ipc claims to not support message queues, but a pip-installed version does: this should be a solvable problem ... feels like it's not built optimally?) 
    • unlike our shmem, these are not physical files in /dev/shm
    • for sysv we control the naming of the numeric keys (which are the equivalent of the filenames), so we can avoid permissions issues that way.  currently the numeric key is formed from the thread number and partition number (see the sketch after this list).  Ric suggests perhaps adding the primary XPM number (cpo points out this is indirect; ideally we would somehow use the "username").  maybe not such a big issue because the username is controlled by the platform number and procmgr.conf.
  • pebble size (both for transition buffers and L1 buffers: the maximum of these two is used for the drp-python shmem)
    • in Ric's new mode the pebble size is determined from the .cnf, or defaults to the .service value if not specified (it used to be the .service), but drp-python can return a dgram larger than the pebble size.  what do we do?
    • we will manage the two bufend limits in the drp-python and we will crash if they get exceeded
    • if we have low-rate large events, could be better to assert damage rather than crash
      • Ric says maybe this is the job of the fex?  cpo says it might be better if we could solve it in one place for all possible fex's.  since truncating the data corrupts it, we would have to mark the xtc as corrupt/not-iterable.
      • A downside of not crashing:  people won't realize there's a problem
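
A small sketch of the key naming and pebble-sized segment described above, using the sysv_ipc pip package mentioned in the notes (the key packing is illustrative; adding the XPM number is only Ric's suggestion):

    import sysv_ipc

    def shmem_key(partition: int, worker: int) -> int:
        # illustrative packing of the partition number and worker/thread number into a numeric key
        return (0x4C43 << 16) | (partition << 8) | worker

    def create_pebble_shmem(partition, worker, l1_size, transition_size):
        size = max(l1_size, transition_size)               # pebble sized to the larger of the two
        return sysv_ipc.SharedMemory(shmem_key(partition, worker),
                                     flags=sysv_ipc.IPC_CREAT, size=size)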

Calibration Constants Broadcast


example serial number: 1.3.5.7  (segments 1,3 run in one drp process, segments 5,7 run in another)

configure or connect:
- we do the socket setup
beginrun:
- we do the broadcast

three options:

1) one pub per drp process
   o disadvantage: more database fetches (10 or 20 simultaneous database fetches)
   o the identity of pub will be determined by threadnumber==0.  valerio says
     this is available in python
2) one pub per drp node (with multiple drp processes per node)
   o feels a little messy
3) one pub per detector (multiple detectors)
   o requires either that we fetch the constants for the whole detector using
     a subset of the serial number (5,7), which Mikhail says doesn't work in
     this case (we don't get the (1,3) constants),
   o or that we exchange serial numbers using the collection mechanism (so
     everyone would know 1,3,5,7)

- ***NEW*** leaning towards (1) since it works now.
- to get a unique port for the pub/sub (allows multiple drp processes on same node)
  two options:
  o use connect_json (heavyweight answer), or
  o use base_port_number+lowest_segment_number.  use zmq ipc endpoints so we
    don't see broadcasts from other nodes (cpo votes for this option).  in this
    case it's a filename, not a port number.  instead of lowest_segment_number,
    use the detector name + segment (e.g. atmopal_0) as the unique ipc name
    (see the sketch below).
- socket setup on configure
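
A hedged sketch of option (1) with the ipc naming above (the endpoint path, pickle serialization and function names are assumptions, not decided code):

    import pickle
    import zmq

    def calib_endpoint(detname: str, segment: int) -> str:
        return f"ipc:///tmp/calib_{detname}_{segment}"     # e.g. .../calib_atmopal_0

    def setup_pub(detname, segment):                       # threadnumber == 0, at configure/connect
        pub = zmq.Context.instance().socket(zmq.PUB)
        pub.bind(calib_endpoint(detname, segment))
        return pub

    def setup_sub(detname, segment):                       # the other python workers, at configure/connect
        sub = zmq.Context.instance().socket(zmq.SUB)
        sub.connect(calib_endpoint(detname, segment))
        sub.setsockopt(zmq.SUBSCRIBE, b"")
        return sub

    def broadcast(pub, constants):                         # at beginrun, after the pub fetches from the database
        pub.send(pickle.dumps(constants))

    def receive(sub):                                      # at beginrun, on each subscriber
        return pickle.loads(sub.recv())

(doing the socket setup at configure/connect and only broadcasting at beginrun also sidesteps the usual zmq pub/sub slow-joiner problem)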

configure:
- chris ford points out: configure is already slow and gets redone more often,
  so connect would be better

creation of python process should be on connect or earlier (e.g. startup)?
- do the socket setup here

python "user startup" (determination of which drp-python script the user has chosen):
- ideally should happen on configure since it is a user "configuration"