
  • numbers shown for each 0.5M above are the serial numbers
  • read out on two nodes because too much data for one (daq-cxi-jungfrau01 and 02)
  • 40Gb NIC in each machine, set up as 4 10Gb interfaces.  MPO cables get broken out very near the detector into LC.
  • 4 segment level processes (2 per node, one per quad) to allow more cpu parallelization
    • "CxiDs1/0/Jungfrau/0" -S 0,2,8 flags in .cnf show the "parent" detector id and the modules of this one (-S 0,2 means 0,1 and -S 2,2 means 2,3)
    • intercepted in the DSS nodes, which put together the 4 pieces into one CxiDs1/0/Jungfrau/0 detector in the final .cnf
    • pdsapp/tools/JungfrauSegBuilder.cc does this.  Included in Recorder.cc.  Does something on both configure and l1accept.  FrameCacheIter holds the pieces before they are memcopied onto the end
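
A minimal sketch of the -S convention described above, assuming the first two fields are (first module, number of modules). The function names are hypothetical, the third field (the 8 in -S 0,2,8) is ignored because its meaning is not stated in these notes, and this is not the JungfrauSegBuilder.cc logic, only an illustration of joining pieces in module order.

```python
def expand_segment_flag(flag):
    """Expand an '-S first,count[,...]' value into module indices.

    Only the first two fields are interpreted; any further field is ignored
    because its meaning is not given in these notes.
    """
    fields = [int(f) for f in flag.split(",")]
    first, count = fields[0], fields[1]
    return list(range(first, first + count))


def assemble_segments(pieces):
    """Join per-segment data into one buffer, ordered by first module index.

    `pieces` maps an '-S' value to the bytes read out by that segment process.
    """
    ordered = sorted(pieces, key=lambda flag: expand_segment_flag(flag)[0])
    return b"".join(pieces[flag] for flag in ordered)


print(expand_segment_flag("0,2"))  # [0, 1]
print(expand_segment_flag("2,2"))  # [2, 3]
```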

Design of KCU1500 Firmware for LCLS2

A conversation with Larry Ruckman on Slack on Feb. 29, 2024

Link is here: https://slac.slack.com/archives/C5SEZCQD6/p1709091378312859

  1 day ago

After talking with @ddamiani I think it would be most useful to have a batching-event-builder to join together the udp packets coming in on the various lanes.  We also need the firmware timestamping done in the kcu1500, since we can’t do it at the camera.


...



  1 day ago

  1. 1GbE or 10GbE for each fiber optic lane?
  2. Only 1 UDP port per fiber optic lane or multiple UDP ports?
     a. If multiple, how do you want to address potential
  3. Will the KCU1500 be a UDP server or UDP client?
  4. Does this KCU1500 send fiber triggers?
  5. Does the KCU1500 need to do bi-directional communication to configure sensor(s)? Or is it only a "listener" of streaming data?
     a. If it does not do configuration, how is the configuration done?
  6. From the statement "udp packets coming in on the various lanes": are we only batching 1 optic lane w/ event building (1 event builder per KCU1500 fiber optic data lane), or do we need to batch all UDP lanes (up to 6 on the KCU1500) into a single event builder (1 event builder per KCU1500)?
  7. To confirm: point-to-point and no ETH switch between the KCU1500 and sensor(s)?
  8. Is there only 1 UDP frame per DAQ trigger per fiber optic lane?
  9. Default IP/MAC addresses and default UDP port that you want the KCU1500 to be for receiving data?
  10. What's the name of the sensor generating the data? I want to match the Github repo name with it.
  11. What's the max. number of UDP lanes that this KCU1500 needs to support?
  12. LCLS-I timing only, LCLS-II only, or both?
     a. If LCLS-I timing only (max. 120 Hz triggering), why not do this in software with a COTS NIC card in the same PC as the TPR?

  1 day ago

Those are all good questions @ruckman.  I will talk with @ddamiani and get back to you with answers today.
  1 day ago

1. each lane is 10GbE
  1 day ago

2. one port per lane
  1 day ago

3. just receives packets
  1 day ago

4. no, the detector is triggered by TTL from a TPR
  1 day ago

5. kcu is only a listener. The configuration of the detector is done over a separate 1GbE copper interface
  1 day ago

7. no switch in between
  1 day ago

8. 128 frames per DAQ trigger - these are the packets that need to be collected together to make up the detector data
  1 day ago

Chris, each lane is a separate module, so each lane can be treated more or less separately. What building in the KCU do we need to do across lanes?
  1 day ago

Thank you Dan, that’s very useful.  Given that, it feels like we need a batching-event-builder on the kcu1500 that event-builds the udp packets AND a timing packet (we will plug Matt’s timing fiber into the KCU).  The batching event-builder in this case is unusual:  we need 128 udp packets per trigger.  Maybe we’ll need to discuss what is best for that?  128 could be “hardcoded”, I think.
  1 day ago

9. Dan said that you can set mac address and ip address in the kcu1500 to whatever you want.  He can program the camera to send to “anything”.  He can also program the camera-side mac/ip to anything that would help you.
  1 day ago

10. “Jungfrau”
  1 day ago

11. We think we would like to have 7 UDP lanes and 1 timing lane.
  1 day ago

12. Only LCLS-II timing.
  1 day ago

I think that answers all your (very useful) questions.  Let us know if more questions arise.
  24 hours ago

13) The bandwidth of the KCU1500 is ~48Gb/s for moving data on the PCIe bus.  If you have 7 UDP lanes into a single KCU1500, that is potentially more bandwidth than the PCIe bus can move.  What's your mitigation strategy?
14) How do you plan to assert back pressure from the KCU1500 to the TPR for stopping DAQ triggers?
15) I don't think the FW event batcher will make timing if we have 128 different UDP frames routed to it.  Can I use a different batcher to pre-process the 128 UDP frames into a "single" frame that feeds into the batcher that combines the data and timing together?
  20 hours ago

15) yes that should be fine
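
A software sketch of the two-stage idea in (15), for illustration only: a first batcher joins the 128 UDP payloads of one trigger into a single frame, and a second step combines that frame with the timing message. The class and function names are hypothetical, and it assumes no packet loss and no interleaving of packets from different triggers; the firmware would need to handle both.

```python
PACKETS_PER_TRIGGER = 128  # from the thread; could be hardcoded


class PacketBatcher:
    """Stage 1: collect a fixed number of UDP payloads into one frame."""

    def __init__(self, packets_per_frame=PACKETS_PER_TRIGGER):
        self.packets_per_frame = packets_per_frame
        self.pending = []

    def push(self, payload):
        """Add one payload; return the joined frame when complete, else None."""
        self.pending.append(payload)
        if len(self.pending) < self.packets_per_frame:
            return None
        frame = b"".join(self.pending)
        self.pending = []
        return frame


def build_event(timing_header, frame):
    """Stage 2: combine one timing message with one pre-processed frame."""
    return timing_header + frame


batcher = PacketBatcher()
frame = None
for i in range(PACKETS_PER_TRIGGER):
    frame = batcher.push(bytes([i]))  # 1-byte stand-in payloads
event = build_event(b"TIME", frame)   # b"TIME" is a stand-in timing message
print(len(event))  # 132
```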
  6 minutes ago

Hi Larry,
13) for the foreseeable future the detector trigger rate will be 120Hz.  So 7 lanes give 0.5Mpixel*2bytes/pixel*120*7=840MB/s which should be good.  Some day in the distant future when the trigger rate increases the traffic will be spread out over more UDP fibers and KCU cards.  Note that the camera currently has 32 UDP fibers, so there would be 4 nodes with 7 fibers, and 1 node with 4 fibers.
14) the TPR will subscribe to a DAQ readout group.  The timing link on the KCU card will assert backpressure to the XPM generating the readout group triggers and cause the jungfrau triggers to stop when we cross the usual buffer “high water mark” in the KCU.
15) I agree with Dan that your idea is a good one: pre-processing the 128 frames into one feels like a reasonable solution.
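
The arithmetic in (13) can be checked with a short script. This is a sketch that counts pixel payload only (UDP and header overhead are not included), and the constant names are made up for the example; the numbers themselves come from the thread.

```python
PIXELS_PER_LANE = 0.5e6     # 0.5 Mpixel module per lane
BYTES_PER_PIXEL = 2
TRIGGER_RATE_HZ = 120
LANES_PER_KCU = 7
PCIE_LIMIT_GBIT_S = 48      # approximate KCU1500 PCIe figure quoted in the thread
TOTAL_FIBERS = 32

# Payload rate into one KCU1500 with 7 lanes.
bytes_per_s = PIXELS_PER_LANE * BYTES_PER_PIXEL * TRIGGER_RATE_HZ * LANES_PER_KCU
mbytes_per_s = bytes_per_s / 1e6
gbit_per_s = bytes_per_s * 8 / 1e9

# How 32 fibers split across cards that take 7 lanes each.
full_nodes, leftover_fibers = divmod(TOTAL_FIBERS, LANES_PER_KCU)

print(f"{mbytes_per_s:.0f} MB/s = {gbit_per_s:.2f} Gb/s (limit ~{PCIE_LIMIT_GBIT_S} Gb/s)")
print(f"{full_nodes} nodes with {LANES_PER_KCU} fibers, 1 node with {leftover_fibers} fibers")
```

At 120Hz the payload is well under the ~48Gb/s PCIe figure, which matches the conclusion above.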
  1 minute ago

One other thought: @ddamiani points out that the 128 UDP packets show up in a fixed but unnatural order.  I can think of three options to get the data in a natural order:

  1. have a programmable register that allows us to specify the desired fixed packet order out of the pre-processor
  2. have the firmware “spy” on the UDP packet content to determine the order.  this is encoded in a header, but feels to me like it would be more awkward than (1) for firmware
  3. have software do the sorting


I would (perhaps selfishly) vote for (1).  What do you think?
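
To make option (1) concrete, here is a small sketch of reordering by a fixed table, written in software purely for illustration. The table plays the role of the programmable register; the function name and the 4-packet arrival order are hypothetical, since the real order of the 128 packets is not given in this thread.

```python
def reorder_packets(packets, order_table):
    """Return packets in natural order.

    order_table[i] is the natural position of the i-th packet to arrive,
    playing the role of the programmable register in option (1).
    """
    if sorted(order_table) != list(range(len(packets))):
        raise ValueError("order_table must be a permutation of packet positions")
    natural = [None] * len(packets)
    for arrival_index, natural_index in enumerate(order_table):
        natural[natural_index] = packets[arrival_index]
    return natural


# Hypothetical 4-packet example; a real table would have 128 entries.
arrived = [b"p2", b"p0", b"p3", b"p1"]
table = [2, 0, 3, 1]
print(reorder_packets(arrived, table))  # [b'p0', b'p1', b'p2', b'p3']
```

Because the arrival order is fixed, the table only needs to be written once at configure time, which is what makes (1) simpler than inspecting each packet header as in (2).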