...

  1. A native-psana approach where the Reduce/Broadcast of image results would be done in bigdata (BD) or server (SRV) nodes
  2. Changes to the current get_data() approach, turning it into a callback inside the psana event loop (this allows the psana event loop to keep running)
  3. Implementing a low-level interface ("psiter") for Legion to use, providing access to smd and the ability to convert it into bigdata. Legion would then implement its own parallelization on top of that (Elliott did this previously for an earlier version of psana). This is missing SlowUpdate.
  4. Decoupling psana and mtip with some buffering
  5. ds.analyze(batch_callback): a batch-level callback. This keeps the callback model; we just need to add a batch-level callback. (Mona/Seema meeting, Apr 14)
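
As a rough illustration of the batch-level callback idea in option (5), here is a minimal standalone sketch. It does not use psana; the function `analyze`, its `batch_size` parameter, and the treatment of the final partial batch are assumptions made for illustration only, not the actual ds.analyze() interface.

```python
def analyze(events, batch_callback, batch_size):
    """Hypothetical stand-in for an event loop that owns iteration.

    The loop keeps running and hands events to the user in batches,
    rather than returning control to the user after every event.
    """
    batch = []
    for evt in events:
        batch.append(evt)
        if len(batch) == batch_size:
            batch_callback(batch)
            batch = []
    if batch:
        # Ragged final batch: shorter than batch_size.
        batch_callback(batch)


# Usage: sum each batch of "events" (plain integers here).
sums = []
analyze(range(10), lambda b: sums.append(sum(b)), batch_size=4)
```

Note that in this model the number of callback invocations depends on how many events each core happens to receive, which is relevant to the Reduce discussion below.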

What we learned:

  • Reduce is necessary for MTIP (cannot be a looser send/receive pattern)
  • Use of Reduce requires user control of the loops, so that users can call it the same number of times on all cores (even at the ragged end of a run).  This can be done, for example, with a pull model (not a callback or push model).  Users may have to specify explicitly how many times they want to Reduce; this is not elegant but workable, except in the live-streaming case, where the length of the run is not known in advance.  A better solution could be to have SMD0 inject step transitions at a regular interval; these would be broadcast to all BD cores and could trigger the Reduce.
  • Option (1) above feels like it is covered by (2) (work done on BD cores) and (4) (work done on SRV cores)
  • Option (4) would require psana SRV cores to be made available for expert-user-level programming (i.e. not following the usual h5 production pattern)
  • Option (4) could use an "mpiChannel" package that Johannes has worked on, which uses MPI-3 one-sided communication.  Note: a few years ago one-sided communication performed poorly in tests cpo did, but Johannes believes it is much better now.
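
The point about calling Reduce the same number of times on every core can be illustrated with a small pure-Python simulation. This is only a sketch: it uses no MPI and no psana, the names `smd0_stream` and `simulate` are hypothetical, events are assigned to cores at random to mimic uneven load, and the final flush transition at end of run is an assumption. A plain `sum` stands in for the collective Reduce.

```python
import random

STEP = "step"  # hypothetical marker for an injected step transition


def smd0_stream(n_events, step_interval):
    """Yield events, injecting a step transition every `step_interval` events."""
    for i in range(n_events):
        yield ("event", i)
        if (i + 1) % step_interval == 0:
            yield (STEP, None)
    if n_events % step_interval != 0:
        yield (STEP, None)  # assumed: flush the ragged tail at end of run


def simulate(n_events, n_cores, step_interval, seed=0):
    rng = random.Random(seed)
    partial = [0] * n_cores       # per-core partial sum since the last step
    events_seen = [0] * n_cores   # uneven: depends on load balancing
    reduce_calls = [0] * n_cores  # must be identical on every core
    totals = []
    for kind, value in smd0_stream(n_events, step_interval):
        if kind == "event":
            core = rng.randrange(n_cores)  # each event goes to one core
            partial[core] += value
            events_seen[core] += 1
        else:
            # The transition is broadcast, so every core enters the
            # reduce exactly once, whatever its event count.
            for core in range(n_cores):
                reduce_calls[core] += 1
            totals.append(sum(partial))  # stand-in for Reduce(SUM)
            partial = [0] * n_cores
    return events_seen, reduce_calls, totals


events_seen, reduce_calls, totals = simulate(103, 4, 10)
```

Because the reduce is triggered by the broadcast transition rather than by a per-core event count, `reduce_calls` is the same on every core even though `events_seen` differs, and the run length does not need to be known in advance.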

...

Whiteboard pictures of the various options and other thoughts:

(whiteboard images not reproduced here)