Spinifel MTIP psana Integration

Discussion on March 18, 2022 with Monarin Uervirojnangkoorn, Elliott Slaughter, Iris Chang, Johannes Blaschke, Chuck Yoon

Goals:

Legion and MPI support for spinifel/mtip
Do we need to continue callback support in psana for legion?
Is the collective "Reduce" necessary and can it be guaranteed that all cores will be able to call it the same number of times? (e.g. at the ragged end-of-run)

Discussed 4 approaches to do this:

(1) A native-psana approach where the Reduce/Broadcast of image results would be done in bigdata (BD) or server (SRV) nodes
(2) Changes to the current get_data() approach, turning that into a callback inside the psana event loop (allows psana event loop to keep running)
(3) Implementing a low level interface for legion to use ("psiter") providing access to smd and the ability to convert it into bigdata. Legion would then implement its own parallelization on top of that (Elliott did this previously for an earlier version of psana). This is missing SlowUpdate.
(4) Decoupling psana and mtip with some buffering
Later addition (Mona/Seema meeting, Apr 14): a batch-level callback, ds.analyze(batch_callback). This keeps the callback model; the only addition needed is the batch-level callback.
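As a rough illustration of the batch-level callback idea in the last item, here is a toy sketch. It is not psana code: `ToyDataSource`, its `analyze` method body, and the `batch_size` parameter are hypothetical stand-ins, and only the `ds.analyze(batch_callback)` call shape is taken from the notes.

```python
from typing import Callable, Iterable, List


class ToyDataSource:
    """Hypothetical stand-in for a data source; NOT the psana API."""

    def __init__(self, events: Iterable[int], batch_size: int) -> None:
        self._events = list(events)
        self._batch_size = batch_size

    def analyze(self, batch_callback: Callable[[List[int]], None]) -> None:
        # The event loop stays inside the data source (push model);
        # user code only ever sees whole batches.
        for start in range(0, len(self._events), self._batch_size):
            batch_callback(self._events[start:start + self._batch_size])


def run_demo() -> List[int]:
    """Run the toy loop and record the size of every batch delivered."""
    batch_sizes: List[int] = []

    def batch_callback(batch: List[int]) -> None:
        batch_sizes.append(len(batch))

    ToyDataSource(range(10), batch_size=4).analyze(batch_callback)
    return batch_sizes
```

With 10 events and a batch size of 4, the last batch is short. In this model the number of callback invocations on a core depends on how many events that core happens to receive, which is the ragged end-of-run concern raised in the goals.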

What we learned:

Reduce is necessary for MTIP (cannot be a looser send/receive pattern)
Use of Reduce requires user control of the loop, so that users can call it the same number of times on all cores (even at the ragged end-of-run). This can be done, for example, with a pull model (not a callback or push model). Users may have to specify explicitly how many times they want to Reduce. That is not elegant but workable, except in a live-streaming case where the length of the run is not known in advance. A better solution could be to have SMD0 inject step transitions at a regular interval; these get broadcast to all BD cores and could trigger the Reduce.
Option (1) above feels like it is covered by (2) (work done on BD cores) and (4) (work done on SRV cores)
Option (4) would require psana SRV cores to be made available for expert-user-level programming (i.e. not doing usual h5 production pattern)
Option (4) could use an "mpiChannel" package that Johannes has worked on with mpi3 one-sided communication. Note: a few years ago one-sided communication performed poorly in tests cpo did, but Johannes believes it is much better now.
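The step-transition idea for triggering Reduce can be illustrated with a small single-process simulation. This is a minimal sketch under stated assumptions, not psana or MPI code: the `STEP`/`EVENT` markers, `make_stream`, `simulate`, the random event-to-core assignment, and the final flush marker at end-of-run are all hypothetical. In a real job the line that sums the partial results would correspond to an MPI collective.

```python
import random
from typing import List, Tuple

STEP = "step"    # stands in for an injected step transition
EVENT = "event"  # stands in for an ordinary event


def make_stream(n_events: int, step_interval: int) -> List[Tuple[str, int]]:
    """Toy SMD0 output: a step marker after every `step_interval` events,
    plus (assumption) one final marker to flush a ragged tail."""
    stream: List[Tuple[str, int]] = []
    for i in range(n_events):
        stream.append((EVENT, i))
        if (i + 1) % step_interval == 0:
            stream.append((STEP, i))
    if n_events % step_interval != 0:
        stream.append((STEP, n_events - 1))
    return stream


def simulate(n_events: int, n_ranks: int, step_interval: int,
             seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (reduce calls per core, reduced value per step)."""
    rng = random.Random(seed)
    partial = [0] * n_ranks       # per-core partial result
    reduce_calls = [0] * n_ranks  # how often each core joined a reduce
    reduced: List[int] = []
    for kind, value in make_stream(n_events, step_interval):
        if kind == EVENT:
            # Each event goes to exactly one core; the split is uneven.
            partial[rng.randrange(n_ranks)] += value
        else:
            # Markers are seen by every core, so every core reduces here,
            # including cores that received no events since the last marker.
            reduced.append(sum(partial))
            for rank in range(n_ranks):
                reduce_calls[rank] += 1
                partial[rank] = 0
    return reduce_calls, reduced
```

Because the reduce is driven by the broadcast markers rather than by per-core event counts, every core makes the same number of reduce calls however the events are distributed, and no core needs to know the run length in advance.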

Conclusions/Plan:

Don't consider (1) since it is covered by (2) and (4)
Try (2) because it is easy (although not ideal, e.g. because of hardwired number of Reduce calls)
Try (3) because it is the easiest way to allow a legion implementation of mtip (Elliott had already done it for an earlier version of psana). Note that this is separate from the callback approach that Seema is supporting, so callbacks are still necessary. The need for the two different approaches is not ideal. With this option the slow epics data from SlowUpdate is difficult to provide, so it will not be supported for now.
Use (4) as a backup plan if (2) runs into trouble, since (4) is more effort.

Whiteboard pictures of the various options and other thoughts:
