From a meeting with Dan Damiani, Jana, Valerio, cpo on Jan 19, 2023. A zoom recording is here: https://pswww.slac.stanford.edu/swdoc/tutorials/jungfrau.mp4
KCU1500
For 16M integration into LCLS-II DAQ. See discussion with Larry here: https://slac.slack.com/archives/C5SEZCQD6/p1709091378312859
Repo with kcu1500 firmware: https://github.com/slaclab/lcls2-udp-pcie-apps/tree/main
Pictures
A 0.5M module in the detector lab.
Larger detectors are made up of several of these smaller modules.
Setup
on psdev: /reg/common/tools/bin/netconfig search *jungfrau* --brief
det-daq:~$ ping det-jungfrau-31
PING det-jungfrau-31.pcdsn (172.21.80.239) 56(84) bytes of data.
^C
--- det-jungfrau-31.pcdsn ping statistics ---
6 packets transmitted, 0 received, 100% packet loss, time 5000ms
det-daq:~$
det-daq:~$ /reg/common/tools/bin/netconfig view det-jungfrau-31
name: det-jungfrau-31
subnet: cds-xcs.pcdsn
Ethernet Address: 00:50:c2:46:d8:b3
IP: 172.21.80.239
PC#: 00000
Location: ASC Room 1034
Contact: uid=ddamiani,ou=People,dc=reg,o=slac
Description: Jungfrau module bchip031 control interface (a.k.a the 512k)
Puppet Classes:
det-daq:~$
det-daq:~$ /reg/common/tools/bin/netconfig edit det-jungfrau-31 --subnet cds-det.pcdsn
Checking parameters against LDAP database ...
Please confirm the following operation:
Modify det-jungfrau-31 properties:
Subnet: cds-det.pcdsn
IP address: 172.21.58.73
Do you really want to apply those changes (y/N) ? y
Updating database ...
newsuperior: dc=cds-det.pcdsn,ou=Subnets,dc=reg,o=slac
newrdn: cn=det-jungfrau-31
dn; cn=det-jungfrau-31,dc=cds-xcs.pcdsn,ou=Subnets,dc=reg,o=slac
Edited node cn=det-jungfrau-31,dc=cds-det.pcdsn,ou=Subnets,dc=reg,o=slac in LDAP directory.
Notify network services that the configuration has changed:
Re-running the command on relay psldapsrv as root, you may
be asked to type in your password:
Warning: Permanently added 'psldapsrv' (ED25519) to the list of known hosts.
[sudo] password for cpo:
Creating new DNS zone file ...
Executing command '/cds/sw/tools/src/LDAP_Helpers/ldap2zone.py /var/named '134.79 172.21.9' cds-det.pcdsn'
Creating new DHCP config file ...
Executing command '/cds/sw/tools/src/LDAP_Helpers/ldap2dhcp.py --ldapsrv psldapsrv --basedn ou=Subnets,dc=reg,o=slac --file /etc/dhcp/dhcpd.conf'
Opening /etc/dhcp/dhcpd.conf for writing ... done
Generating DHCP configuration from ou=Subnets,dc=reg,o=slac ... done
Restarting services ... done.
Network services are now in sync with the LDAP directory.
Connection to psldapsrv closed.
det-daq:~$
det-daq:~$ ping det-jungfrau-31
PING det-jungfrau-31.pcdsn (172.21.58.73) 56(84) bytes of data.
64 bytes from 172.21.58.73 (172.21.58.73): icmp_seq=1 ttl=64 time=0.449 ms
64 bytes from 172.21.58.73 (172.21.58.73): icmp_seq=2 ttl=64 time=0.315 ms
^C
--- det-jungfrau-31.pcdsn ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.315/0.382/0.449/0.067 ms
det-daq:~$
Notes
- each jungfrau runs linux (blackfin?)
- unconfigured current draw is 0.8A, configured current draw is 3A for a 0.5M
- evr trigger goes into "trigger in" lemo input on back of camera
- ethernet goes to CDS subnet in DAQ lab (a couple of small switches lying around have this subnet)
- has both software and firmware
- ethernet goes to CPU
- fiber goes to FPGA
- data streams out of the FPGA via UDP (we need jumbo frames, but not pause frames; might need other ethtool settings. we think the intel/myricom NIC defaults are good)
- the FEZ interface that receives the EVR multicasts (and transmits the bulk data to the DSS nodes) needs interrupt coalescing (may depend on the NIC): maybe 75us for the myricom and 50us for the intel NIC? the coalescing matters on nodes reading out large detectors, or large numbers of small detectors
- jungfrau segment level
- event builds multiple modules
- stitches together 8 asics from each 0.5M in a natural order
- sometimes need to update software/firmware
- can update firmware via the CPU, but if that fails you have to JTAG, and it's Altera. The Altera tools are installed on the machine daq-det-standalone (had to do this more with the early jungfraus)
- each 0.5M silicon has a serial number that can't be read out via the DAQ
- these serial numbers are used by mikhail to lookup calibration constants, but must be managed administratively
- use the fiber farthest away from the ethernet
- most module fibers are "green" when locked, but one of them is red unfortunately
- in daq lab fiber goes to daq-det-jungfrau (also used for epix's)
- second fiber is used for high-rate bonding which isn't supported yet
- on the daq-det-jungfrau machine the FEZ is 172.21.59.53
- the serial number of the module (used by Mikhail's calibration lookup) is the mac address of the module (Dan reads these from the modules)
- the detector fiber is hooked up to one of the other interfaces
- use ethtool to look for link locked (ethtool enp5s0)
Standalone Tools
- standalone executable to read out jungfrau's (same code as segment level)
- see also Running DAQ devices standalone
- want to run as detopr because the executable creates temporary files; otherwise there can be permission problems cleaning them up to allow a future instance to run
- executables live in /reg/g/pcds/dist/pds/current/build/pdsapp/bin (no env setup necessary)
- /reg/g/pcds/dist/pds/current/build/pdsapp/bin/x86_64-rhel7-opt/jungfrauStandAlone -P 32410 -H 10.1.1.105 -m 00:60:dd:45:66:df -d 10.1.1.55 -s det-jungfrau-31 (use -h for help)
- -P is port number on the host side (-P and -H are a pair)
- -H is the host IP address of the fiber interface
- -m is the mac of the host fiber interface
- -d the ip address of the detector
- -s hostname of the detector control interface (what's in netconfig)
- look at cnf file to get params: "grep jungfrau-31 /cds/group/pcds/dist/pds/det/scripts/det.cnf"
- if this works, the camera is working; any remaining problem is likely in the DAQ config.
- pds/jungfrau/DataFormat.hh shows the structure of the UDP packet. framenumber is trigger number, and packet number goes from 0-127.
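The packet fields mentioned above can be sketched as a small parser. Note the actual header layout lives in pds/jungfrau/DataFormat.hh; the field order and sizes below are invented for illustration only:

```python
import struct

# Hypothetical header layout -- the real one is defined in
# pds/jungfrau/DataFormat.hh; field order/sizes here are illustrative only.
HEADER = struct.Struct('<QI')  # framenumber (u64), packetnumber (u32) -- assumed

def parse_packet(datagram: bytes):
    """Split an assumed Jungfrau UDP datagram into (framenumber, packetnumber, payload).

    framenumber is the trigger number; packetnumber runs 0-127 per the notes."""
    frame, packet = HEADER.unpack_from(datagram, 0)
    assert 0 <= packet <= 127, "packet number runs 0-127 per the notes"
    return frame, packet, datagram[HEADER.size:]

# Example: fabricate a datagram for trigger 42, packet 5 with a 16-byte payload
dgram = HEADER.pack(42, 5) + b'\x00' * 16
frame, packet, payload = parse_packet(dgram)
```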
Generate one frame in standalone (some free-running trigger):
# configuration in standalone mode is hardwired in the code (same as jungfrau default config values)
daq-det-jungfrau:~> /reg/g/pcds/dist/pds/current/build/pdsapp/bin/x86_64-rhel7-opt/jungfrauStandAlone -P 32410 -H 10.1.1.105 -m 00:60:dd:45:66:df -d 10.1.1.55 -s det-jungfrau-31
Shared memory created /slsDetectorPackage_multi_0
Shared memory created /slsDetectorPackage_multi_0_sls_0
detector udp_rx interface appears to be unset
setting up detector udp_rx interface
cmd_put rx_udpport: 32410
cmd_put rx_udpip: 10.1.1.105
cmd_put rx_udpmac: 00:60:dd:45:66:df
cmd_put detectorip: 10.1.1.55
cmd_put detectormac: 00:aa:bb:cc:dd:ee
cmd_put configuremac: 0
detector udp_rx interface is up
Configuring 1 modules
checking status of module 0
reg_gett 0x5e: 0x0
module chips need to be powered on
configuring dacs of module 0
Setting Dacs:
setting vb_ds to 1000
cmd_put dac:5: 1000
setting vb_comp to 1220
cmd_put dac:0: 1220
setting vb_pixbuf to 750
cmd_put dac:4: 750
setting vref_ds to 480
cmd_put dac:6: 480
setting vref_comp to 420
cmd_put dac:7: 420
setting vref_prech to 1450
cmd_put dac:3: 1450
setting vin_com to 1053
cmd_put dac:2: 1053
setting vdd_prot to 3000
cmd_put dac:1: 3000
configuring adc of module 0
powering on the chip
Detector returned error: Writing to register 0x5e failed: wrote 0x1 but read 0x3
Write to register failed
reg_put 0x5e - 0x1: 0x0
resetting the adc
adc_put 0x8 - 0x3: 0xffffffff
adc_put 0x8 - 0: 0xffffffff
adc_put 0x14 - 0x40: 0xffffffff
adc_put 0x4 - 0xf: 0xffffffff
adc_put 0x5 - 0x3f: 0xffffffff
adc_put 0x18 - 0x2: 0xffffffff
reg_put 0x43 - 0x453b2a9c: 0x453b2a9c
configuring clock speed of module 0
setting detector to half speed
cmd_put clkdivider: 1
configuring acquistion settings of module 0
reseting run control ... done
setting trigger delay to 0.000238
configuring for free run
reg_put 0x4e - 0: 0x0
cmd_put cycles: 1
cmd_put frames: 1
cmd_put period: 0.200000000
setting exposure time to 0.000010 seconds
cmd_put exptime: 0.000010000
configuring gain and bias of module 0
setting bias voltage to 200 volts
cmd_put vhighvoltage: 200
setting gain mode 0
clearbit 0x5d - 0: 0xf00
clearbit 0x5d - 1: 0xf00
clearbit 0x5d - 2: 0xf00
clearbit 0x5d - 12: 0xf00
clearbit 0x5d - 13: 0xf00
starting detector: idle
got frame: 1
stopping detector: idle
daq-det-jungfrau:~>

# reprogram FPGA. they usually send us a file to reprogram. here we can brick it, and need to recover with altera jtag as described above
./sls_detector_put programfpga <filename>

# altera jtag files (*.pof) and the programming tool live here (altera was bought out by intel)
# intelFPGA_lite is the executable for jtagging
daq-det-standalone:~$ ls /opt/jungfrau_firmware/
intelFPGA_lite  Jungfrau_MCB_v0.6.pof  readme.txt
Jungfrau_MCB.rawbin  Jungfrau_MCB_v0.7.pof  setup.sh
daq-det-standalone:~$

# reprogram software: files are in slsDetectorsPackage-4.1.0/serverBin/jungfrauDetectorServerv4.0.2.0
# we have a tftp server on daq-det-standalone; put files here:
daq-det-standalone:~$ ls /var/lib/tftpboot
jungfrauDetectorServerv3.0.0.6.3  jungfrauDetectorServerv3.1.3.0  powerctrl
jungfrauDetectorServerv3.1.1.0  jungfrauDetectorServerv4.0.2.0
daq-det-standalone:~$

# similarly, the program "powerctrl" runs on the power supply for the 4M (only!), which has an embedded linux system.
# it lives on daq-det-standalone at /var/lib/tftpboot/powerctrl (don't use softlinks for this; could move it to a "backup" version)

telnet det-jungfrau-31   (gives us a prompt on the camera)
tftp daq-det-standalone
get jungfrauDetectorServerv3.1.3.0
# there is a symlink pointing to the current version. change the symlink using "ln"
# reboot the device by typing "reboot" or power cycling. can't brick the device/os by messing this file up.
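For quick reference, the hardwired standalone configuration values can be pulled out of the transcript above (a Python sketch; the names and values are transcribed from the console output, so check against the code before relying on them):

```python
# DAC settings from the standalone transcript above (name -> value).
STANDALONE_DACS = {
    'vb_ds': 1000,
    'vb_comp': 1220,
    'vb_pixbuf': 750,
    'vref_ds': 480,
    'vref_comp': 420,
    'vref_prech': 1450,
    'vin_com': 1053,
    'vdd_prot': 3000,
}

# Other acquisition defaults from the same transcript.
STANDALONE_ACQ = {
    'clkdivider': 1,        # half speed
    'exptime': 10e-6,       # 10 us exposure
    'period': 0.2,          # free-run period, seconds
    'vhighvoltage': 200,    # bias, volts
}
```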
Running the DAQ
In the detector lab:
- need to hook the right EVR trigger from daq-det-portable2
- "ssh det-daq -l detopr"
- restartdaq (uses det.cnf)
- log files are in /reg/g/pcds/pds/det/logfiles/2023/01/
Troubleshooting
- If no jungfrau triggers are seen (or they are intermittent) and you are using a Rohde & Schwarz supply (used for the 0.5M and 1M, but not the 4M), you often need to hook up an extra "chassis ground"
- if data fibers are swapped (on the 1M or 4M) then the IP addresses don't match the NICs. symptom: everything will configure, but you won't see any data: only fixups, because L1Accepts will time out (or the standalone executable will wait forever for a frame, unless one programs a timeout). on the 1M try swapping fibers. on the 4M you can power off modules individually and watch with ethtool that the correct one goes off (can't power off individually on the 1M, but you can unplug fibers to create the same effect). could also mess this up with a 0.5M if you plug into the wrong NIC.
- lemo trigger input doesn't go into "trigger in". symptom: no data on triggers. will work in free-run mode with the standalone executable in this case
- daq doesn't configure early in the config process. causes:
- not plugging in all the ethernet cables
- need to re-ip the modules with netconfig
- daq doesn't configure late in the config process. cause:
- current limit set too low on the power supply
- even though the steady-state current draw for one module is 3A, the limit needs to be set to 5A to handle config. so for a 1M, the limit must be 10A.
- watch out for too-small power supply cable gauge (e.g. a new cable) since voltage drop over small wires can be an issue
- geometry files must be deployed in the expt calib dir for ami1 to work
- check edm screens to see if detector is still on (watch for trips)
- detector starts damaging with a "lost sync" message (the code checks the time between triggers from the camera and verifies that it matches the evr timestamps). typically fixed by reconfiguring, but it could be a triggering issue (see the first troubleshooting point)
- can also have dropped triggers if the exposure time is set to longer than the trigger period
- if detector doesn't configure: could check obscure expert config settings against previous versions
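The power-supply and exposure rules above lend themselves to small sanity checks. This sketch assumes the numbers from the notes (3A steady state, ~5A per module during config, exposure must be shorter than the trigger period); the function names are made up:

```python
def required_current_limit(n_modules: int,
                           config_amps_per_module: float = 5.0) -> float:
    """Per the notes: steady state is ~3 A per 0.5M module, but the
    supply limit must be ~5 A per module to survive configuration."""
    return n_modules * config_amps_per_module

def exposure_ok(exposure_s: float, trigger_rate_hz: float) -> bool:
    """Triggers get dropped if the exposure exceeds the trigger period."""
    return exposure_s < 1.0 / trigger_rate_hz

limit_1M = required_current_limit(2)   # a 1M is two 0.5M modules -> 10.0 A
ok = exposure_ok(10e-6, 120.0)         # 10 us exposure at 120 Hz -> True
```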
4M Idiosyncrasies
- numbers shown for each 0.5M above are the serial numbers
- read out on two nodes because too much data for one (daq-cxi-jungfrau01 and 02)
- 40Gb NIC in each machine, set up as 4 10Gb interfaces. MPO cables get broken out into LC very near the detector.
- 4 segment level processes (2 per node, one per quad) to allow more cpu parallelization
- "CxiDs1/0/Jungfrau/0" -S 0,2,8 flags in .cnf show the "parent" detector id and the modules of this on (-S 0,2 means 0,1 and -S 2,2 means 2,3)
- intercepted in the DSS nodes, which put together the 4 pieces into one CxiDs1/0/Jungfrau/0 detector in the final .cnf
- pdsapp/tools/JungfrauSegBuilder.cc does this. It is included in Recorder.cc and does something on both Configure and L1Accept. FrameCacheIter holds the pieces before they are memcopied onto the end
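A toy model of the stitching described above (this is not the actual JungfrauSegBuilder.cc logic; the class and function names are invented, and the real code memcopies into a contiguous buffer rather than joining byte strings):

```python
# Toy model of the quad stitching: pieces from the four segment-level
# processes are cached per trigger and concatenated in module order once
# all have arrived. Names here are illustrative only.

def modules_from_S(first: int, count: int) -> list:
    """-S first,count in the .cnf names the modules a process reads,
    e.g. -S 0,2 -> [0, 1] and -S 2,2 -> [2, 3]."""
    return list(range(first, first + count))

class FrameCache:
    def __init__(self, n_pieces: int = 4):
        self.n_pieces = n_pieces
        self.pending = {}  # trigger -> {quad: data}

    def add(self, trigger: int, quad: int, data: bytes):
        """Cache one quad's piece; return the stitched frame once all
        n_pieces for this trigger have arrived, else None."""
        pieces = self.pending.setdefault(trigger, {})
        pieces[quad] = data
        if len(pieces) == self.n_pieces:
            del self.pending[trigger]
            return b''.join(pieces[q] for q in sorted(pieces))
        return None
```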
Design of KCU1500 Firmware for LCLS2
A conversation with Larry Ruckman on Slack on Feb. 29, 2024
Link is here: https://slac.slack.com/archives/C5SEZCQD6/p1709091378312859
After talking with @ddamiani I think it would be most useful to have a batching-event-builder to join together the udp packets coming in on the various lanes. We also need the firmware timestamping done in the kcu1500, since we can’t do it at the camera.
20 replies
- 1GbE or 10GbE for each fiber optic lane?
- Only 1 UDP port per fiber optic lane or multiple UDP ports?
- If multiple, how do you want to address potential
- Will the KCU1500 be a UDP server or UDP client?
- Does this KCU1500 send fiber triggers?
- Does the KCU1500 need to do bi-directional communication to configure sensor(s)? Or is only a "listener" of streaming data?
- If not, how is the configuration done?
- From this statement, "udp packets coming in on the various lanes", are we only batching 1 optic lane w/ event building (1 event builder per KCU1500 fiber optic data lane), or do we need to batch all UDP lanes (up to 6 on the KCU1500) into a single event build (1 event builder per KCU1500)?
- To confirm: point-to-point and no ETH switch between the KCU1500 and sensor(s)?
- Is there only 1 UDP frame per DAQ trigger per fiber optic lane?
- Default IP/MAC addresses and default UDP port that you want the KCU1500 to be for receiving data?
- What's the name of the sensor generating the data? I want to match the Github repo name with it.
- What's the max. number of UDP lanes that this KCU1500 needs to support?
- LCLS-I timing only, LCLS-II only, or both?
- If LCLS-I timing only (max. 120 Hz triggering), why not do this in software with a COTS NIC card in the same PC as the TPR?
Those are all good questions @ruckman. I will talk with @ddamiani and get back to you with answers today.
1. each lane is 10GbE
2. one port per lane
3. just receives packets
4. no, the detector is triggered by TTL from a TPR
5. the kcu is only a listener. The configuration of the detector is done over a separate 1GbE copper interface
7. no switch in between
8. 128 frames per DAQ trigger; these are the packets that need to be collected together to make up the detector data
Chris, each lane is a separate module, so each lane can be treated more or less separately. What building in the kcu do we need to do across lanes?
Thank you Dan, that’s very useful. Given that, it feels like we need a batching-event-builder on the kcu1500 that event-builds the udp packets AND a timing packet (we will plug Matt’s timing fiber into the KCU). The batching event-builder in this case is unusual: we need 128 udp packets per trigger. Maybe we’ll need to discuss what is best for that? 128 could be “hardcoded”, I think.
9. Dan said that you can set mac address and ip address in the kcu1500 to whatever you want. He can program the camera to send to “anything”. He can also program the camera-side mac/ip to anything that would help you.
10. “Jungfrau”
11. We think we would like to have 7 UDP lanes and 1 timing lane.
12. Only LCLS-II timing.
I think that answers all your (very useful) questions. Let us know if more questions arise.
13) The bandwidth of the KCU1500 is ~48Gb/s for moving data on the PCIe bus. If you have 7 UDP lanes into a single KCU1500, that is potentially more bandwidth than the PCIe bus can move. What's your mitigation strategy?
14) How do you plan to assert back pressure from the KCU1500 to the TPR for stopping DAQ triggers?
15) I don't think the FW event batcher will make timing if we have 128 different UDP frames routed to it. Can I use a different batcher to pre-process the 128 UDP frames into a "single" frame that feeds into the batcher that combines the data and timing together?
15) yes that should be fine
Hi Larry,
13) for the foreseeable future the detector trigger rate will be 120Hz. So 7 lanes give 0.5Mpixel * 2bytes/pixel * 120Hz * 7 = 840MB/s, which should be fine. Some day in the distant future when the trigger rate increases, the traffic will be spread out over more UDP fibers and KCU cards. Note that the camera currently has 32 UDP fibers, so there would be 4 nodes with 7 fibers and 1 node with 4 fibers.
14) the TPR will subscribe to a DAQ readout group. The timing link on the KCU card will assert backpressure to the XPM generating the readout-group triggers and cause the jungfrau triggers to stop when we cross the usual buffer "high water mark" in the KCU.
15) I agree with Dan that your idea is a good one: pre-processing the 128 frames into one feels like a reasonable solution.
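The 840MB/s figure in point 13 checks out arithmetically:

```python
# Bandwidth estimate from the thread: 7 lanes of 0.5 Mpixel modules,
# 2 bytes/pixel, 120 Hz trigger rate.
pixels = 0.5e6
bytes_per_pixel = 2
rate_hz = 120
lanes = 7

throughput_MBps = pixels * bytes_per_pixel * rate_hz * lanes / 1e6
# -> 840.0 MB/s, comfortably under the KCU1500's ~48 Gb/s (~6 GB/s) PCIe budget
```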
One other thought: @ddamiani points out that the 128 UDP packets show up in a fixed but unnatural order. I can think of three options to get the data in a natural order:
- have a programmable register that allows us to specify the desired fixed packet order out of the pre-processor
- have the firmware “spy” on the UDP packet content to determine the order. this is encoded in a header, but feels to me like it would be more awkward than (1) for firmware
- have software do the sorting
I would (perhaps selfishly) vote for (1). What do you think?
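For comparison, option (3), sorting in software, is cheap if the arrival order really is fixed: a one-time permutation table maps arrival positions to natural packet numbers. A sketch (the 4-packet arrival order here is invented for illustration; the real detector sends 128 packets per trigger):

```python
# Option (3): software reordering of packets that arrive in a fixed but
# unnatural order, using the header-encoded packet number (0-127).

def build_permutation(arrival_order: list) -> list:
    """perm[i] = index in the arrival buffer holding natural packet i."""
    perm = [0] * len(arrival_order)
    for buf_index, packet_number in enumerate(arrival_order):
        perm[packet_number] = buf_index
    return perm

def reorder(packets: list, perm: list) -> list:
    """Return the packets in natural order."""
    return [packets[i] for i in perm]

# Illustrative 4-packet example (invented arrival order)
arrival = [2, 0, 3, 1]                  # packet numbers in arrival order
perm = build_permutation(arrival)       # computed once, reused every trigger
natural = reorder([b'p2', b'p0', b'p3', b'p1'], perm)
```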