From a meeting with Dan Damiani, Jana, Valerio, cpo on Jan 19, 2023.  A zoom recording is here:  https://pswww.slac.stanford.edu/swdoc/tutorials/jungfrau.mp4

KCU1500

For 16M integration into LCLS-II DAQ.  See discussion with Larry here: https://slac.slack.com/archives/C5SEZCQD6/p1709091378312859

Repo with kcu1500 firmware: https://github.com/slaclab/lcls2-udp-pcie-apps/tree/main

Pictures

Photo of a 0.5M module in the detector lab.

Larger detectors are made up of several of these smaller modules.

Setup

on psdev: /reg/common/tools/bin/netconfig search *jungfrau* --brief

det-daq:~$ ping det-jungfrau-31
PING det-jungfrau-31.pcdsn (172.21.80.239) 56(84) bytes of data.
^C
--- det-jungfrau-31.pcdsn ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5000ms

det-daq:~$ 
det-daq:~$ /reg/common/tools/bin/netconfig view det-jungfrau-31
name: det-jungfrau-31
subnet: cds-xcs.pcdsn
Ethernet Address: 00:50:c2:46:d8:b3
IP: 172.21.80.239
PC#: 00000
Location: ASC Room 1034
Contact: uid=ddamiani,ou=People,dc=reg,o=slac
Description: Jungfrau module bchip031 control interface (a.k.a the 512k)
Puppet Classes:

det-daq:~$ 


det-daq:~$ /reg/common/tools/bin/netconfig edit det-jungfrau-31 --subnet cds-det.pcdsn
Checking parameters against LDAP database ... 

Please confirm the following operation:
Modify det-jungfrau-31 properties:
Subnet: cds-det.pcdsn
IP address: 172.21.58.73

Do you really want to apply those changes (y/N) ? y
Updating database ...
newsuperior: dc=cds-det.pcdsn,ou=Subnets,dc=reg,o=slac
newrdn: cn=det-jungfrau-31
dn; cn=det-jungfrau-31,dc=cds-xcs.pcdsn,ou=Subnets,dc=reg,o=slac

Edited node cn=det-jungfrau-31,dc=cds-det.pcdsn,ou=Subnets,dc=reg,o=slac in LDAP directory.

Notify network services that the configuration has changed:
Re-running the command on relay psldapsrv as root, you may
be asked to type in your password:
Warning: Permanently added 'psldapsrv' (ED25519) to the list of known hosts.
[sudo] password for cpo: 
Creating new DNS zone file ...  
Executing command '/cds/sw/tools/src/LDAP_Helpers/ldap2zone.py /var/named '134.79 172.21.9' cds-det.pcdsn'
Creating new DHCP config file ...  
Executing command '/cds/sw/tools/src/LDAP_Helpers/ldap2dhcp.py --ldapsrv psldapsrv --basedn ou=Subnets,dc=reg,o=slac --file /etc/dhcp/dhcpd.conf'
Opening /etc/dhcp/dhcpd.conf for writing ...  done
Generating DHCP configuration from ou=Subnets,dc=reg,o=slac ...  done

Restarting services ... done.
Network services are now in sync with the LDAP directory.
Connection to psldapsrv closed.

det-daq:~$ 

det-daq:~$ ping det-jungfrau-31
PING det-jungfrau-31.pcdsn (172.21.58.73) 56(84) bytes of data.
64 bytes from 172.21.58.73 (172.21.58.73): icmp_seq=1 ttl=64 time=0.449 ms
64 bytes from 172.21.58.73 (172.21.58.73): icmp_seq=2 ttl=64 time=0.315 ms
^C
--- det-jungfrau-31.pcdsn ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.315/0.382/0.449/0.067 ms
det-daq:~$ 

Notes

  • each jungfrau runs linux (blackfin?)
  • unconfigured current draw is 0.8A, configured current draw is 3A for a 0.5M
  • evr trigger goes into "trigger in" lemo input on back of camera
  • ethernet goes to CDS subnet in DAQ lab (a couple of small switches lying around have this subnet)
  • has both software and firmware
  • ethernet goes to CPU
  • fiber goes to FPGA
  • data streams out of FPGA via UDP (we need jumbo frames, but not pause frames; might need other ethtool settings; we think the intel/myricom nic settings are good at the defaults)
  • the FEZ interface that receives the EVR multicasts (and transmits the bulk data to the DSS nodes) needs interrupt coalescing (may depend on the nic).  maybe 75us for myricom and 50us for the intel nic?  the coalescing matters on nodes reading out large detectors, or large numbers of small detectors
  • jungfrau segment level
    • event builds multiple modules 
    • stitches together 8 asics from each 0.5M in a natural order
  • sometimes need to update software/firmware
  • can update firmware via CPU, but if that fails have to jtag, but it's altera.  altera is installed on machine daq-det-standalone (had to do this more with the early jungfrau's)
  • each 0.5M silicon has a serial number that can't be read out via the DAQ
    • these serial numbers are used by mikhail to lookup calibration constants, but must be managed administratively
  • use the fiber farthest away from the ethernet
    • most module fibers are "green" when locked, but one of them is red unfortunately
    • in daq lab fiber goes to daq-det-jungfrau (also used for epix's)
  • second fiber is used for high-rate bonding which isn't supported yet
  • on daq-det-jungfrau machine fez is 172.21.59.53
  • the serial number of the module (used by Mikhail's calibration lookup) is the mac address of the module (Dan reads these from the modules)
  • the detector fiber is hooked up to one of the other interfaces
    • use ethtool to look for link locked (ethtool enp5s0)
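The NIC settings mentioned in the notes above (jumbo frames, no pause frames, interrupt coalescing on the FEZ interface) could be collected in a small helper like the one below. This is only a sketch: the 9000-byte MTU, the use of `rx-usecs` as the coalescing knob, and the FEZ interface name `enp1s0f0` are assumptions, not values recorded at the meeting. The helper only prints the commands; it does not run them.

```python
import shlex


def data_nic_commands(iface, mtu=9000):
    """Commands for the interface that receives the detector UDP stream."""
    return [
        ["ip", "link", "set", "dev", iface, "mtu", str(mtu)],  # jumbo frames (MTU value assumed)
        ["ethtool", "-A", iface, "rx", "off", "tx", "off"],    # pause frames off
        ["ethtool", iface],                                    # check the link status by eye
    ]


def fez_nic_commands(iface, rx_usecs):
    """Interrupt coalescing command for the FEZ interface."""
    return [["ethtool", "-C", iface, "rx-usecs", str(rx_usecs)]]


if __name__ == "__main__":
    # enp5s0 is the fiber interface named in the notes; enp1s0f0 is hypothetical.
    commands = data_nic_commands("enp5s0") + fez_nic_commands("enp1s0f0", rx_usecs=50)
    for cmd in commands:
        print(shlex.join(cmd))
```

The 50us and 75us values in the notes are themselves guesses ("maybe"), so the coalescing time is left as a required argument rather than given a default.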

Standalone Tools

  • standalone executable to read out jungfrau's (same code as segment level)
    • see also Running DAQ devices standalone 
    • want to run as detopr: the executable creates temporary files, and if they are owned by another user there can be permission problems cleaning them up, which prevents a future instance from running
    • executables live in /reg/g/pcds/dist/pds/current/build/pdsapp/bin (no env setup necessary)
    • /reg/g/pcds/dist/pds/current/build/pdsapp/bin/x86_64-rhel7-opt/jungfrauStandAlone -P 32410 -H 10.1.1.105 -m 00:60:dd:45:66:df -d 10.1.1.55 -s det-jungfrau-31 (use -h for help)
      • -P is port number on the host side (-P and -H are a pair)
      • -H is the host IP address of the fiber interface
      • -m is the mac of the host fiber interface
      • -d the ip address of the detector
      • -s hostname of the detector control interface (what's in netconfig)
    • look at cnf file to get params: "grep jungfrau-31 /cds/group/pcds/dist/pds/det/scripts/det.cnf"
    • if this works, the camera itself is working; the daq config could still be messed up.
    • pds/jungfrau/DataFormat.hh shows the structure of the UDP packet.  framenumber is trigger number, and packet number goes from 0-127.
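As an illustration of how 128 packets per trigger can be collected into one frame, here is a minimal sketch. The header layout used here (`<QI`: a 64-bit frame number followed by a 32-bit packet number) is hypothetical and is NOT taken from pds/jungfrau/DataFormat.hh; consult that file for the real structure. The 8192-byte payload is also an assumption, derived from a 512x1024-pixel module at 2 bytes/pixel split over 128 packets.

```python
import struct

PACKETS_PER_FRAME = 128        # packet number runs 0-127 (from the notes)
PAYLOAD_BYTES = 8192           # assumption: 512*1024 pixels * 2 bytes / 128 packets
HEADER = struct.Struct("<QI")  # HYPOTHETICAL layout: framenumber (u64), packet number (u32)


class FrameAssembler:
    """Collects packets by frame number until all 128 have arrived."""

    def __init__(self):
        self.pending = {}  # framenumber -> {packet number: payload}

    def add(self, datagram):
        """Add one datagram; return (framenumber, data) once the frame is complete, else None."""
        frame, pkt = HEADER.unpack_from(datagram)
        if not 0 <= pkt < PACKETS_PER_FRAME:
            raise ValueError(f"unexpected packet number {pkt}")
        payload = datagram[HEADER.size:HEADER.size + PAYLOAD_BYTES]
        parts = self.pending.setdefault(frame, {})
        parts[pkt] = payload
        if len(parts) < PACKETS_PER_FRAME:
            return None
        del self.pending[frame]
        # Concatenate in packet-number order, regardless of arrival order.
        return frame, b"".join(parts[i] for i in range(PACKETS_PER_FRAME))
```

A real receiver would also need a timeout for frames that never complete, which is the "wait forever for a frame" case in the standalone executable.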

Generate one frame in standalone (some free-running trigger):

# configuration in standalone mode is hardwired in the code (same as jungfrau default config values)
daq-det-jungfrau:~> /reg/g/pcds/dist/pds/current/build/pdsapp/bin/x86_64-rhel7-opt/jungfrauStandAlone -P 32410 -H 10.1.1.105 -m 00:60:dd:45:66:df -d 10.1.1.55 -s det-jungfrau-31
Shared memory created /slsDetectorPackage_multi_0 
Shared memory created /slsDetectorPackage_multi_0_sls_0 
detector udp_rx interface appears to be unset
setting up detector udp_rx interface
cmd_put rx_udpport: 32410
cmd_put rx_udpip: 10.1.1.105
cmd_put rx_udpmac: 00:60:dd:45:66:df
cmd_put detectorip: 10.1.1.55
cmd_put detectormac: 00:aa:bb:cc:dd:ee
cmd_put configuremac: 0
detector udp_rx interface is up
Configuring 1 modules
checking status of module 0
reg_gett 0x5e: 0x0
module chips need to be powered on
configuring dacs of module 0
Setting Dacs:
setting vb_ds to 1000
cmd_put dac:5: 1000
setting vb_comp to 1220
cmd_put dac:0: 1220
setting vb_pixbuf to 750
cmd_put dac:4: 750
setting vref_ds to 480
cmd_put dac:6: 480
setting vref_comp to 420
cmd_put dac:7: 420
setting vref_prech to 1450
cmd_put dac:3: 1450
setting vin_com to 1053
cmd_put dac:2: 1053
setting vdd_prot to 3000
cmd_put dac:1: 3000
configuring adc of module 0
powering on the chip
Detector returned error: Writing to register 0x5e failed: wrote 0x1 but read 0x3

Write to register failed 
reg_put 0x5e - 0x1: 0x0
resetting the adc
adc_put 0x8 - 0x3: 0xffffffff
adc_put 0x8 - 0: 0xffffffff
adc_put 0x14 - 0x40: 0xffffffff
adc_put 0x4 - 0xf: 0xffffffff
adc_put 0x5 - 0x3f: 0xffffffff
adc_put 0x18 - 0x2: 0xffffffff
reg_put 0x43 - 0x453b2a9c: 0x453b2a9c
configuring clock speed of module 0
setting detector to half speed
cmd_put clkdivider: 1
configuring acquistion settings of module 0
reseting run control ... done
setting trigger delay to 0.000238
configuring for free run
reg_put 0x4e - 0: 0x0
cmd_put cycles: 1
cmd_put frames: 1
cmd_put period: 0.200000000
setting exposure time to 0.000010 seconds
cmd_put exptime: 0.000010000
configuring gain and bias of module 0
setting bias voltage to 200 volts
cmd_put vhighvoltage: 200
setting gain mode 0
clearbit 0x5d - 0: 0xf00
clearbit 0x5d - 1: 0xf00
clearbit 0x5d - 2: 0xf00
clearbit 0x5d - 12: 0xf00
clearbit 0x5d - 13: 0xf00
starting detector: idle
got frame: 1
stopping detector: idle
daq-det-jungfrau:~> 

# reprogram FPGA.  they usually send us a file to reprogram.  here we can brick it, need to recover with altera jtag as described above
./sls_detector_put programfpga <filename>

# altera jtag files (*.pof) and program live here (altera was bought out by intel)
# intelFPGA_lite is the executable for jtagging
daq-det-standalone:~$ ls /opt/jungfrau_firmware/
intelFPGA_lite       Jungfrau_MCB_v0.6.pof  readme.txt
Jungfrau_MCB.rawbin  Jungfrau_MCB_v0.7.pof  setup.sh
daq-det-standalone:~$ 

# reprogram software files are in slsDetectorsPackage-4.1.0/serverBin/jungfrauDetectorServerv4.0.2.0
# have a tftp server on daq-det-standalone, put files here:
daq-det-standalone:~$ ls /var/lib/tftpboot
jungfrauDetectorServerv3.0.0.6.3  jungfrauDetectorServerv3.1.3.0  powerctrl
jungfrauDetectorServerv3.1.1.0    jungfrauDetectorServerv4.0.2.0
daq-det-standalone:~$ 

# similarly, the program "powerctrl" runs on the embedded linux system of the power supply for the 4M (only!).  it lives on daq-det-standalone (don't use softlinks for this, could move it to a "backup" version)
/var/lib/tftpboot/powerctrl

telnet det-jungfrau-31 (gives us a prompt on the camera)
tftp daq-det-standalone get jungfrauDetectorServerv3.1.3.0
# there is a symlink pointing to the current version.  change the symlink using "ln"
# reboot device by typing "reboot" or power cycling. can't brick the device/os by messing this file up.

Running the DAQ

In the detector lab:

  • need to hook the right EVR trigger from daq-det-portable2
  • "ssh det-daq -l detopr"
  • restartdaq (uses det.cnf)
  • log files are in /reg/g/pcds/pds/det/logfiles/2023/01/

Troubleshooting

  • If no jungfrau triggers seen (or intermittent) and using a Rohde & Schwarz supply (used for 0.5M and 1M, but not 4M) often need to hook up an extra "chassis ground" 
  • if data fibers are swapped (on 1M or 4M) then the IP addresses don't match the NICs.  symptom: everything will configure, but won't see any data: only fixups because L1Accepts will time out, or the standalone executable will wait forever for a frame (unless one programs a timeout).  on 1M try swapping fibers.  on 4M can power off modules individually and watch with ethtool that the correct one goes off (can't power off individually on the 1M, but can unplug fibers to create the same effect).  Could also mess up with 0.5M if you plug into the wrong NIC.
  • lemo trigger input doesn't go into "trigger in".  symptom: no data on triggers.  will work in free-run mode with the standalone executable in this case
  • daq doesn't configure early in the config process.  causes:
    • not plugging in all the ethernet cables
    • need to re-ip the modules with netconfig
  • daq doesn't configure late in the config process.  cause:
    • current limit set too low on the power supply
    • even though steady state current draw for one module is 3A, need to set limit to 5A in order to handle config.  so for a 1M, limit must be 10A.
  • watch out for too-small power supply cable gauge (e.g. a new cable) since voltage drop over small wires can be an issue
  • geometry files must be deployed in the expt calib dir for ami1 to work
  • check edm screens to see if detector is still on (watch for trips)
  • detector starts damaging with "lost sync" message (the check compares the time between triggers from the camera against the evr timestamps): typically reconfigure, but could be a triggering issue (see first troubleshooting point)
  • can also have dropped triggers if the exposure time is set to longer than the trigger period
  • if detector doesn't configure: could check obscure expert config settings against previous versions
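The power-supply current limit rule from the list above can be written down as a small calculation. This is a sketch that assumes the limit simply scales with the number of 0.5M modules, which matches the 1M example in the notes but has not been checked for other detector sizes.

```python
STEADY_STATE_A = 3.0  # configured draw of one 0.5M module (from the notes)
CONFIG_LIMIT_A = 5.0  # limit needed per 0.5M module to get through configure (from the notes)


def supply_current_limit(n_modules):
    """Current limit in amps for a supply feeding n_modules 0.5M modules (linear scaling assumed)."""
    return n_modules * CONFIG_LIMIT_A


def headroom(n_modules):
    """Amps between the configure-time limit and the steady-state draw."""
    return supply_current_limit(n_modules) - n_modules * STEADY_STATE_A


if __name__ == "__main__":
    for name, n in (("0.5M", 1), ("1M", 2)):
        print(name, supply_current_limit(n), "A limit,", headroom(n), "A headroom")
```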

4M Idiosyncrasies

  • numbers shown for each 0.5M above are the serial numbers
  • read out on two nodes because too much data for one (daq-cxi-jungfrau01 and 02)
  • 40Gb nic in each machine, set up as 4 10Gb interfaces.  MPO cables get broken out very near the detector into LC.
  • 4 segment level processes (2 per node, one per quad) to allow more cpu parallelization
    • "CxiDs1/0/Jungfrau/0" -S 0,2,8  flags in .cnf show the "parent" detector id and the modules of this one (-S 0,2 means 0,1 and -S 2,2 means 2,3)
    • intercepted in the DSS nodes which puts together the 4 pieces into one CxiDs1/0/Jungfrau/0 detector in the final .cnf
    • pdsapp/tools/JungfrauSegBuilder.cc does this.  Included in Recorder.cc.  does something on both configure and l1accept.  FrameCacheIter holds the pieces before they are memcopied onto the end
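The -S convention described above (first module, then number of modules) can be illustrated with a short parser. This is a sketch only: the notes do not say what the third field in "-S 0,2,8" means, so it is ignored here, and the four example flag values for the 4M are an assumption extrapolated from "2 per node, one per quad".

```python
def modules_for_segment(spec):
    """Return the module indices selected by a '-S start,count[,...]' value.

    Only the first two fields are interpreted; any further field (such as
    the 8 in '0,2,8') is ignored because its meaning is not given in the notes.
    """
    fields = [int(x) for x in spec.split(",")]
    start, count = fields[0], fields[1]
    return list(range(start, start + count))


if __name__ == "__main__":
    # Hypothetical flags for the four segment level processes of the 4M.
    for spec in ("0,2,8", "2,2,8", "4,2,8", "6,2,8"):
        print(f"-S {spec} -> modules {modules_for_segment(spec)}")
```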

Design of KCU1500 Firmware for LCLS2

A conversation with Larry Ruckman on slack on Feb. 29, 2024

Link is here: https://slac.slack.com/archives/C5SEZCQD6/p1709091378312859

  1 day ago

After talking with @ddamiani I think it would be most useful to have a batching-event-builder to join together the udp packets coming in on the various lanes.  We also need the firmware timestamping done in the kcu1500, since we can’t do it at the camera.

  1 day ago

  1. 1GbE or 10GbE for each fiber optic lane?
  2. Only 1 UDP port per fiber optic lane or multiple UDP ports?
    a. If multiple, how do you want to address potential
  3. Will the KCU1500 be a UDP server or UDP client?
  4. Does this KCU1500 send fiber triggers?
  5. Does the KCU1500 need to do bi-directional communication to configure sensor(s)? Or is only a "listener" of streaming data?
    a. If not configuration, how is the configuration done?
  6. From this "udp packets coming in on the various lanes" statement, are we only batching 1 optic lane w/ event building (1 event builder per KCU1500 fiber optic data lane) or need to batch all UDP lanes (up to 6 on the KCU1500) into a single event building (1 event builder per KCU1500)?
  7. To confirm: point-to-point and no ETH switch between the KCU1500 and sensor(s)?
  8. Is there only 1 UDP frame per DAQ trigger per fiber optic lane?
  9. Default IP/MAC addresses and default UDP port that you want the KCU1500 to be for receiving data?
  10. What's the name of the sensor generating the data? I want to match the Github repo name with it.
  11. What's the max. number of UDP lanes that this KCU1500 need to support?
  12. LCLS-I timing only, LCLS-II only, or both?
    a. If LCLS-I timing only (max. 120 Hz triggering), why not do this in software with a COTS NIC card in the same PC as the TPR?


  1 day ago

Those are all good questions @ruckman.  I will talk with @ddamiani and get back to you with answers today.
  1 day ago

1. each lane is 10GbE
  1 day ago

2. one port per lane
  1 day ago

3. just receives packets
  1 day ago

4. no the detector is triggered by ttl from a tpr
  1 day ago

5. kcu is only a listener. The configuration of the detector is done over a separate 1GbE copper interface
  1 day ago

7. no switch in between
  1 day ago

8. 128 frames per DAQ trigger - these are the packets that need to be collected together to make up the detector data
  1 day ago

Chris, each lane is a separate module so each lane can be treated more or less separately.  What building in the kcu do we need to do across lanes?
  1 day ago

Thank you Dan, that’s very useful.  Given that, it feels like we need a batching-event-builder on the kcu1500 that event-builds the udp packets AND a timing packet (we will plug Matt’s timing fiber into the KCU).  The batching event-builder in this case is unusual:  we need 128 udp packets per trigger.  Maybe we’ll need to discuss what is best for that?  128 could be “hardcoded”, I think.
  1 day ago

9. Dan said that you can set mac address and ip address in the kcu1500 to whatever you want.  He can program the camera to send to “anything”.  He can also program the camera-side mac/ip to anything that would help you.
  1 day ago

10. “Jungfrau”
  1 day ago

11. We think we would like to have 7 UDP lanes and 1 timing lane.
  1 day ago

12. Only LCLS-II timing.
  1 day ago

I think that answers all your (very useful) questions.  Let us know if more questions arise.
  24 hours ago

13) The bandwidth of the KCU1500 is ~48Gb/s for moving data on the PCIe bus.  If you have 7 UDP lanes into a single KCU1500, that is potentially more bandwidth than the PCIe bus can move.  What's your mitigation strategy?
14) How do you plan to assert back pressure from the KCU1500 to the TPR for stopping DAQ triggers?
15) I don't think the FW event batcher will make timing if we have 128 different UDP frames routed to it.  Can I use a different batcher to pre-process the 128 UDP frames into a "single" frame that feeds into the batcher that combines the data and timing together?
  20 hours ago

15) yes that should be fine
  6 minutes ago

Hi Larry,
13) For the foreseeable future the detector trigger rate will be 120Hz.  So 7 lanes give 0.5Mpixel*2bytes/pixel*120*7=840MB/s which should be good.  Some day in the distant future when the trigger rate increases the traffic will be spread out over more UDP fibers and KCU cards.  Note that the camera currently has 32 UDP fibers, so there would be 4 nodes with 7 fibers, and 1 node with 4 fibers.
14) The TPR will subscribe to a DAQ readout group.  The timing link on the KCU card will assert backpressure to the XPM generating the readout group triggers and cause the jungfrau triggers to stop when we cross the usual buffer “high water mark” in the KCU.
15) I agree with Dan that your idea is a good one: pre-processing the 128 frames into one feels like a reasonable solution.
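The bandwidth and fiber-count arithmetic in the message above can be checked with a few lines. This is a sketch that uses the round numbers from the thread (0.5 Mpixel per lane, 2 bytes/pixel, ~48Gb/s PCIe) and ignores UDP and header overhead; the function names are made up for illustration.

```python
PIXELS_PER_LANE = 500_000  # "0.5Mpixel" as used in the thread
BYTES_PER_PIXEL = 2
PCIE_GBPS = 48             # KCU1500 PCIe figure quoted in the thread


def card_rate_MBps(lanes, trigger_hz):
    """Payload rate into one KCU1500 in MB/s (overheads ignored)."""
    return PIXELS_PER_LANE * BYTES_PER_PIXEL * trigger_hz * lanes / 1e6


def fibers_per_node(total_fibers, lanes_per_card):
    """Split the camera's fibers over nodes with one card each."""
    full, rest = divmod(total_fibers, lanes_per_card)
    return [lanes_per_card] * full + ([rest] if rest else [])


if __name__ == "__main__":
    rate = card_rate_MBps(lanes=7, trigger_hz=120)
    pcie_MBps = PCIE_GBPS * 1000 / 8
    print(f"7 lanes at 120 Hz: {rate:.0f} MB/s of {pcie_MBps:.0f} MB/s PCIe")
    print("fibers per node:", fibers_per_node(32, 7))
```

At 120Hz the 7 lanes use a small fraction of the quoted PCIe bandwidth; the concern in question 13 only applies if all 7 lanes run near their 10GbE line rate.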
  1 minute ago

One other thought: @ddamiani points out that the 128 UDP packets show up in a fixed but unnatural order.  I can think of three options to get the data in a natural order:

  1. have a programmable register that allows us to specify the desired fixed packet order out of the pre-processor
  2. have the firmware “spy” on the UDP packet content to determine the order.  this is encoded in a header, but feels to me like it would be more awkward than (1) for firmware
  3. have software do the sorting


I would (perhaps selfishly) vote for (1).  What do you think?
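To make options (1) and (3) concrete, here is a software model of both. It is only a sketch: the real arrival order of the 128 packets is not given in this thread, so the reversed order used in the example is purely hypothetical, and `order` stands in for whatever the programmable register would hold.

```python
N_PACKETS = 128


def reorder_fixed(packets, order):
    """Option (1): apply a fixed table; order[i] is the arrival slot holding natural packet i."""
    if sorted(order) != list(range(len(packets))):
        raise ValueError("order must be a permutation of the arrival slots")
    return [packets[src] for src in order]


def reorder_by_header(packets, packet_number):
    """Option (3): software sorts using the packet number read from each header."""
    return sorted(packets, key=packet_number)


if __name__ == "__main__":
    # Hypothetical arrival order: packets arrive reversed (127, 126, ..., 0).
    arrival = list(range(N_PACKETS - 1, -1, -1))
    packets = [{"num": n} for n in arrival]
    order = [arrival.index(i) for i in range(N_PACKETS)]
    fixed = reorder_fixed(packets, order)
    by_header = reorder_by_header(packets, lambda p: p["num"])
    print(fixed == by_header, fixed[0], fixed[-1])
```

The fixed table needs no knowledge of the packet contents, which is what makes option (1) attractive for firmware, but it silently produces a wrong order if the camera ever changes its packet sequence.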

