
The XPP large detector is 4 Mpx and runs at 25 kHz.  From Vincent: two multimode MPO-24 fiber bundles per megapixel (8 bundles total) go to the MM-to-SM conversion box. This means 8×24 = 192 fibers (96 pairs), which is 2 full SM-MM conversion boxes. (Aside: the SparkPix scales similarly with pixel count, so it only needs 24 fibers (1× MPO-24).)
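The fiber-count arithmetic above can be sketched as a few lines of Python. The per-box capacity of 96 fibers is an assumption inferred from "192 fibers is 2 full boxes"; it is not stated explicitly in the text.

```python
# Fiber-count arithmetic for the XPP large detector (numbers from the text).
MEGAPIXELS = 4            # detector size in Mpx
BUNDLES_PER_MPX = 2       # multimode MPO-24 bundles per megapixel (per Vincent)
FIBERS_PER_BUNDLE = 24

bundles = MEGAPIXELS * BUNDLES_PER_MPX   # 8 bundles total
fibers = bundles * FIBERS_PER_BUNDLE     # 192 fibers
pairs = fibers // 2                      # 96 pairs
# ASSUMPTION: one SM-MM conversion box handles 96 fibers,
# inferred from "192 fibers is 2 full boxes".
FIBERS_PER_BOX = 96
boxes = fibers // FIBERS_PER_BOX         # 2 full conversion boxes

print(bundles, fibers, pairs, boxes)     # 8 192 96 2
```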

For TXI (2 Mpx, 5 kHz), based on the epixhremu study from ric/stefano/valerio/mikhail/mona, it feels like there is a good chance we can do it on 20 CPU nodes (each with ~50 cores).  The XPP detector has 10x that data volume, and scaling to 200 nodes feels too difficult (although technology improvements may reduce that number somewhat, perhaps by 2x).
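A quick back-of-envelope check of the 10x data-volume claim and the naive node-count scaling:

```python
# Pixel-rate comparison between TXI and XPP (numbers from the text).
txi_rate = 2e6 * 5e3          # pixels/s for TXI: 2 Mpx at 5 kHz
xpp_rate = 4e6 * 25e3         # pixels/s for XPP: 4 Mpx at 25 kHz
ratio = xpp_rate / txi_rate   # 10x the TXI data volume

TXI_NODES = 20                         # CPU-node estimate from the epixhremu study
naive_xpp_nodes = TXI_NODES * ratio    # 200 nodes by linear scaling
optimistic_nodes = naive_xpp_nodes / 2 # ~100 nodes if technology improves ~2x
```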

Proposed approach: we should target GPUs to reduce the node count (ideally to 24 nodes, each taking 1 LR4 multi-color fiber).  This is ~8 GB/s into each node, which is difficult with the KCU1500 in our existing nodes, but will hopefully be doable in time for XPP.
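Where the ~8 GB/s per node figure comes from, assuming 16-bit (2-byte) pixels; the pixel depth is not stated in the text, so treat it as an assumption:

```python
# Per-node bandwidth estimate for 24 nodes, one LR4 fiber each.
pixels = 4e6            # 4 Mpx detector
rate_hz = 25e3          # 25 kHz frame rate
bytes_per_pixel = 2     # ASSUMPTION: 16-bit pixels

total_bw = pixels * rate_hz * bytes_per_pixel   # 200 GB/s off the detector
NODES = 24                                      # one LR4 fiber per node
per_node_gbs = total_bw / NODES / 1e9           # ~8.3 GB/s into each node
```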

If using 1 LR4 fiber per node isn't doable (i.e., we need more nodes), we should still use LR4 to reduce the fiber count in the BOS, and use a SM-MM box in SRCF to split the LR4 out into more PLR4 fibers.

To-do items:

  • test common-mode speed on GPU (Seshu has demonstrated that Mikhail's calibration formula without common-mode will be fast on a GPU)
  • meet with XPP scientists to understand what data-reduction algorithms are needed for their hutch
  • if libsz is one of those algorithms, understand its performance on GPU
  • benchmark other data-reduction algorithms on GPU
  • consider multiple options for algorithm implementation: cupy, cuNumeric, CUDA kernels
  • talk to TID engineers about DMA'ing the KCU1500 data directly to the GPU
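For the first to-do item, a minimal sketch of what a common-mode correction looks like as an array operation. This is a generic per-row median subtraction, not Mikhail's actual calibration formula (which isn't given here); it only illustrates the kind of kernel whose GPU speed is in question. Written with numpy, which cupy can replace with the same API to run on a GPU.

```python
import numpy as np

def common_mode_rows(frame, dark):
    """Subtract the dark frame, then remove the per-row common mode,
    estimated here as the row median (a common generic choice)."""
    corrected = frame.astype(np.float32) - dark
    row_cm = np.median(corrected, axis=1, keepdims=True)
    return corrected - row_cm

# Synthetic example: a dark frame plus a uniform 5-count common-mode offset.
rng = np.random.default_rng(0)
dark = rng.normal(100.0, 1.0, size=(512, 1024)).astype(np.float32)
frame = dark + 5.0
out = common_mode_rows(frame, dark)
# After correction, the residual on every pixel is ~0.
```

With cupy, `np` becomes `cp` and the arrays live on the GPU; benchmarking that swap (and the equivalent cuNumeric or hand-written CUDA kernel) is exactly the first to-do item above.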