Current Development Machine Name

lcls-pc83236           # machine with the kcu1500 in 901

lcls-pc88284           # Silke's machine
drp-tst-acc02          # machine with the kcu1500 in the setup lab
rdsrv223                 # machine with usb connection to kcu1500 for jtag flashing and integrated logic analyzer (ila) functionality

Preparing git ssh keys 

Add your ssh keys to git (This is unfortunately necessary because the .gitmodules file in lcls2-pcie-apps uses the "git" form of the URL instead of the "http" form):

https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/

Also note that the firmware requires a recent version of git that supports Git Large File Storage (git-lfs).  Add /afs/slac/g/reseng/git/git/bin to PATH.

 

Rogue Documentation

https://slaclab.github.io/rogue/

Conda Commands to Create Rogue Environment

This is in addition to the other packages that must be built for the rogue library (see next step).  These conda commands are derived from:

https://github.com/slaclab/rogue/blob/master/Readme_python3.txt

conda create -n timetool
source activate timetool
conda install pyyaml
conda install pyzmq
conda install -c conda-forge parse
conda install click
conda install MySQLdb
conda install -c bioconda mysqlclient
conda install -c conda-forge pyro4
conda install numpy
pip install recordclass
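
A quick way to confirm the environment is complete is to check that each package can be imported.  The sketch below is illustrative only: the import names (yaml for pyyaml, zmq for pyzmq, Pyro4 for pyro4, MySQLdb for mysqlclient) are assumptions about how the packages above are imported, and missing_modules is a hypothetical helper, not part of rogue.

```python
import importlib.util

# Assumed import names for the packages installed above (not taken from the rogue docs).
ROGUE_ENV_MODULES = ["yaml", "zmq", "parse", "click", "MySQLdb", "Pyro4", "numpy", "recordclass"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ROGUE_ENV_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated timetool environment; anything listed as missing probably still needs a conda or pip install.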


Building Rogue

git clone https://github.com/slaclab/rogue.git

This needs the conda environment from the previous step.  Follow the build instructions in the README files in the rogue root directory (although I suggest using the conda python3 env from the previous step instead of pip install):

https://github.com/slaclab/rogue/blob/master/Readme_build.txt (no longer exists as of 6/25/2018; use the link below instead)

https://github.com/slaclab/rogue/blob/master/README.md

cd rogue

git checkout 44c65dab0f18c4e65dcc1e1aea0060a51457a7a5

git submodule init

git submodule update

cd drivers

git checkout 0278e9f4a477c0e74724c5ea4b3fd0afb18faa66

cd ..

rm -r build

mkdir build

cd build

cmake ..

make

To run, source this script:

https://github.com/slaclab/lcls2-pcie-apps/blob/master/software/TimeTool/setup_env_template.csh

Some applications are not built by default.  cd to directory and make.

Building Firmware

Follow the instructions in the README.md here (make sure to use the modern AFS version of git described above so you can use git-lfs):

https://github.com/slaclab/lcls2-pcie-apps.git

git submodule init

git submodule update

(old: source /afs/slac/g/reseng/xilinx/vivado_2017.3/Vivado/2017.3/settings64.sh  #this isn't working on 1/31/2018. sourcing below instead)

source lcls2-pcie-apps/firmware/setup_env_slac.sh

if you want to store the output of "make" on your local machine: in the "firmware/" directory, "ln -s /u1/sioan/build ."

cd firmware/targets/TimeToolKcu1500

make

 

Flashing KCU1500 with Firmware

configuration memory part number for kcu1500

mt25qu512-spi-x1_x2_x4_x8

https://docs.google.com/presentation/d/1VVfkIWN9M_czZiaXhK4iFp-Drj_yc64smbzpSwZ61Cg/edit?usp=sharing


Front-end Board (pgp->camlink converter) Firmware

https://github.com/slaclab/cameralink-gateway

For the OFFICIAL BOARD use configuration memory part number s25fl128sxxxxxx0-spi-x1_x2-x4

For the EVAL BOARD front-end board firmware (a xilinx KC705 rev. 1.1) use configuration memory part number 28f00ap30t-bpi-x16

 

Here's the mcs file location

$TOP/cameralink-gateway/firmware/targets/ClinkFebPgp2b_1ch/images/ClinkFebPgp2b_1ch-0x00000025-20190315182526-ruckman-af9cde50.mcs

Making Vivado communicate with board over USB/JTAG

Larry has some slides on how to program the flash chips (mt25qu512) on the KCU1500 via USB/JTAG.  Start up "vivado" after setting up the firmware environment:

https://docs.google.com/presentation/d/10eIsAbLmslcNk94yV-F1D3hBfxudBf0EFo4xjcn9qPk/edit#slide=id.g245233f915_0_41

Can see if the usb cable works by doing "lsusb" and looking for "Bus 001 Device 005: ID 0403:6010 Future Technology Devices International, Ltd FT2232C/D/H Dual UART/FIFO IC".
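
The same lsusb check can be scripted.  This is a minimal sketch that assumes the cable always reports the 0403:6010 ID quoted above; has_jtag_cable and jtag_cable_present are hypothetical helper names.

```python
import subprocess

FTDI_JTAG_ID = "0403:6010"  # vendor:product ID quoted in the lsusb line above


def has_jtag_cable(lsusb_output):
    """Return True if any lsusb line reports the FTDI ID used by the JTAG cable."""
    return any(f"ID {FTDI_JTAG_ID}" in line for line in lsusb_output.splitlines())


def jtag_cable_present():
    """Run lsusb and look for the cable (needs lsusb on the PATH)."""
    result = subprocess.run(["lsusb"], capture_output=True, text=True, check=True)
    return has_jtag_cable(result.stdout)
```

Note that other FTDI FT2232 devices share this ID, so a match only shows that such a device is attached, not that it is the KCU1500 cable.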

Do this to program the flash chips on the KCU1500 for the first time (after the first time, you should be able to update over pcie with the updateProm script described below).  Before programming, lspci will show:

02:00.0 Serial controller: Xilinx Corporation Device 8638 (rev ff)

After programming powercycle the machine.  Then lspci should show:

02:00.0 Signal processing controller: SLAC National Accelerator Lab PPA-REG Device 2030
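
The before/after lspci strings can be turned into a simple check.  The sketch below only matches the two device strings quoted above, which may differ on other machines or firmware versions; kcu1500_state is a hypothetical helper name.

```python
def kcu1500_state(lspci_output):
    """Classify lspci text using the two device strings quoted above."""
    for line in lspci_output.splitlines():
        if "SLAC National Accelerator Lab" in line:
            return "programmed"
        if "Xilinx Corporation Device 8638" in line:
            return "unprogrammed"
    return "not found"
```

Feed it the text printed by lspci after the powercycle; "not found" suggests the card did not enumerate at all.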

#https://www.xilinx.com/support/answers/59128.html

1) Disconnect all Xilinx USB cables from the host computer.
2) Open a shell or terminal console.
3) Extract the driver script and its support files to a local drive of the machine where the cable will be used by typing:

#must cd to directory. Can't run install_drivers from arbitrary directory.

cd /afs/slac.stanford.edu/g/reseng/xilinx/vivado_2017.4/Vivado/2017.4/data/xicom/cable_drivers/lin64/install_script/install_drivers/
sudo ./install_drivers

#now vivado hardware manager will see the kcu1500 board.

NOTE: every time the KCU is programmed via jtag or pcie, the computer must be powercycled.  This is because it loses its pcie "id" numbers (called "pcie enumeration") which are allocated at power-on time.

#instructions to program using vivado hardware manager in link below

https://www.xilinx.com/support/documentation/sw_manuals/xilinx2013_2/ug908-vivado-programming-debugging.pdf

SLAC Driver

Build/install the datadev.ko driver using the instructions here:

https://github.com/slaclab/lcls2-pcie-apps

This needs to be done on the machine where the KCU1500 lives.  You need sudo on the machine to install the driver.

Programming FPGA over PCI

After the first programming (and power-cycling) described above, use this script to reprogram:

https://github.com/slaclab/lcls2-pcie-apps/blob/master/software/TimeTool/scripts/updateProm.py

NOTE: every time the KCU is programmed via jtag or pcie, the computer must be powercycled.  This is because it loses its pcie "id" numbers (called "pcie enumeration") which are allocated at power-on time.

If using $TOP/lcls2-pcie-apps/software/TimeTool/scripts/updateProm.py, then you need to run:

 

sudo insmod $TOP/aes-stream-drivers/data_dev/driver/datadev.ko
sudo chmod 666 /dev/datadev_0
source /reg/g/psdm/etc/psconda.sh
conda activate /reg/neh/home/cpo/.conda/envs/timetoolLab2/
source setup_env_template.sh
./updateProm.py --mcs_pri ~/Desktop/temp_mcs/TimeToolKcu1500_primary.mcs --mcs_sec ~/Desktop/temp_mcs/TimeToolKcu1500_secondary.mcs

then hard reboot machine

(computer lcls-pc83236 loads the driver on boot using /etc/sysconfig/modules/datadev_0.modules as of 7/10/2018)

TimeTool Software Files

lcls2-pcie-apps/firmware/applications/TimeTool/python/TimeTool.py: a description of the "addValue" register

lcls2-pcie-apps/firmware/submodules/surf/python/surf/protocols/clink/*.py: descriptions of the "clink" (cameralink) parameters

lcls2-pcie-apps/software/TimeTool/python/TimeToolDev.py: top level class called by gui.py

It opens /dev/datadev_0 and glues together various register maps using classes like ClinkTest, TimeToolCore, dataWriter.

 

conda activate /reg/neh/home/cpo/.conda/envs/timetoolLab2/      #if on SLAC network
conda activate timetoolqt5                                      #if on LCLS network
 
source setup_env_template.sh
./scripts/gui.py

 

In TimeToolDev.py:

  • self.add adds registers and associated GUI control

Settings Needed To Run Camera

Use channela to talk to the front-end board in the "Variables" tab:

  • linkmode: medium
  • datamode: 8bit
  • framemode: our camera only gives a line valid (indication that there is valid data) so we need to set to "line"
  • tapcount: our camera sends 4 bytes where it could send 6
  • baudrate: 9600
  • send escape right away after powering up
  • use sendGCP to test serial link is working (output should appear on terminal)
  • swcontrolvalue/swcontrolen: bits for hardware vs software trigger (0/0 internal trigger).  These bits control CC1 through CC4.  In external trigger mode, camera triggers on falling edge of CC1.

Now tell the camera how to send data using these three-letter-commands (TLC) in the "Commands" tab:

  • CLM 1 (cameralink medium mode)
  • SVM 1 (test pattern ramp).  NOTE: with the test pattern, the camera always seems to read out at the internal line rate, even in external trigger mode.  To have external triggering controlled only by CC1, need to set SVM to 0.
  • SSF 1 (software trigger rate 1Hz, although seems to read back as 6Hz? and 2 reads back as 12Hz?)
  • SEM 0 (Exposure Mode: internally controlled exposure time)
  • SET 20000 (Exposure Time: exposure in ns)
  • STM n (Trigger Mode: 0 for internal, 1 for external ... we're guessing there is a typo in the manual where the values should start from 0)
  • SPF 2 (turns to 12 bit. 0 is 8 bit)

Now enable triggers:

  • in "Variables" tab set DataEn for channel A to true
  • Framecount field should increment at 6Hz
  • Dropcount field counts "3 channels misaligned" errors, and so should stay at 0

Camera output should appear on terminal after setting ClinkTop->ChannelA->DataEn to True

Output of SendGCP:

Got Response: 
Got Response: Model          P4_CM_02K10D_00_R
Got Response: Microcode      03-081-20296-13
Got Response: CCI            03-110-20294-03
Got Response: FPGA           03-056-20470-03
Got Response: Serial #       12102856
Got Response: BiST:          Good
Got Response: 
Got Response: DefaultSet     1
Got Response: Ext Trig       Off
Got Response: Trig Overlap   Off
Got Response: Line Rate      1 [Hz]
Got Response: Meas L.R.      6 [Hz]
Got Response: Max  L.R.      19607 [Hz]
Got Response: Exp. Mode      Timed 
Got Response: Multi Exp. Mode   Off 
Got Response: Exp. Time[0]   50000 [ns]
Got Response: Meas E.T.[0]   50000 [ns]
Got Response: Max  E.T.      3000500 [ns]
Got Response: 
Got Response: Test Pat.      1:Ramp1
Got Response: Direction      Internal, Forward
Got Response: TDI Stages     2
Got Response: Vert. Bin      1
Got Response: Hor. Bin       1
Got Response: Flat Field     Off
Got Response: Offset         0
Got Response: System Gain    1.00
Got Response: Mirror         Off
Got Response: AOI Mode:      Off
Got Response: Scan Type      Line Scan
Got Response: CL Speed       85MHz
Got Response: CL Config      Medium
Got Response: Pixel Fmt      8 bits
Got Response: CPA ROI        1-2048
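
If the SendGCP output needs to be checked from a script, the "Got Response:" lines can be parsed into a dictionary.  This is a sketch that assumes the field name and value are always separated by at least three spaces, as in the capture above; parse_gcp is a hypothetical helper, not part of the TimeTool software.

```python
import re

PREFIX = "Got Response:"

# A few lines copied from the capture above, used as example input.
SAMPLE = """Got Response: 
Got Response: Model          P4_CM_02K10D_00_R
Got Response: BiST:          Good
Got Response: Max  L.R.      19607 [Hz]
Got Response: Multi Exp. Mode   Off 
Got Response: CL Config      Medium
"""


def parse_gcp(text):
    """Turn SendGCP terminal output into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        body = line[len(PREFIX):].strip()
        if not body:
            continue  # blank response line
        # Field names may contain one or two spaces, so split on a run of 3+.
        parts = re.split(r"\s{3,}", body, maxsplit=1)
        if len(parts) != 2:
            continue
        key = " ".join(parts[0].split()).rstrip(":")
        fields[key] = parts[1].strip()
    return fields


if __name__ == "__main__":
    for name, value in parse_gcp(SAMPLE).items():
        print(f"{name} = {value}")
```

A script could then, for example, confirm that "CL Config" reads back as Medium after sending CLM 1.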

Notes From Matt On Evr Firmware

wrapper to transceivers:

https://github.com/slaclab/lcls-timing-core/blob/release-lcls2/LCLS-II/core/rtl/TimingGthWrapper.vhd

decoding the output:

https://github.com/slaclab/lcls-timing-core/blob/release-lcls2/LCLS-II/core/rtl/TimingCore.vhd (this link is bad. use link below instead.)

https://github.com/slaclab/lcls-timing-core/blob/589c8bc10d02a0cdcf08999b254d2ed071c87577/LCLS-II/core/rtl/TimingCore.vhd

this record has a strobe that says when it's valid:
appTimingBus : out TimingBusType;

has the record structure:

https://github.com/slaclab/lcls-timing-core/blob/release-lcls2/LCLS-II/core/rtl/TimingPkg.vhd (this link is bad. use link below instead.)

https://github.com/slaclab/lcls-timing-core/blob/4de46d35c3536879b0fb4deff6a70b241a0a67e0/LCLS-II/core/rtl/TimingPkg.vhd

strobe : sl; -- which clock cycle it is valid
valid : sl; --
message : TimingMessageType; -- for lcls-II
stream : TimingStreamType; -- for lcls-I (eventcodes in this record)
v1 : LclsV1TimingDataType;
v2 : LclsV2TimingDataType;
modesel : sl; -- LCLS-II selected -- tells us the mode, another register sets it

axi-stream: streaming flavor of AXI (Advanced eXtensible Interface, part of ARM's AMBA spec): no address involved, like a port, push/acknowledge
axi: full memory interface: address/values and can burst multiple values
axi-lite: used for register interfaces: 32-bit value with address

What we learned about timing stream:

eventcode is part of TimingStreamType
timingstreamrx (timing message) and timingrx (timing message) have outputs of type timingstream
timingrx instantiates both LCLS-I and LCLS-II timing streams:

  • lcls1 is timingstreamrx
  • lcls2 is timingframerx

timing core instantiates timingrx

TimingCore outputs TimingBusType and TimingStreamType is a member of this and has the eventcodes.

timingcore is instantiated by EvrFrontEnd

EvrFrontEnd is instantiated by Hardware.  Hardware has TimingBusType as a local variable.  This is the highest it gets.

Hardware is instantiated by the TimeToolKcu1500

 

type AxiStreamMasterType is record
   tValid : sl;                  -- data is ready to be clocked in or out (equivalent of write enable)
   tData  : slv(127 downto 0);   -- the actual data that the FIFO transmits
   tStrb  : slv(15 downto 0);
   tKeep  : slv(15 downto 0);    -- which bytes from the tData to keep
   tLast  : sl;                  -- identifies the last tData of the frame
   -- Everything in the following lines is an axi extension called ssi (slac streaming interface)
   tDest  : slv(7 downto 0);     -- identifies the destination when using multiple axis on a single bus
   tId    : slv(7 downto 0);     -- transaction id for handshaking (validation the signal was )
   tUser  : slv(127 downto 0);   -- user bits. can be used for anything. have been used for start of frame (SOF) and end of frame errors (EOFE)
end record AxiStreamMasterType;
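
To make the role of tKeep concrete, here is a small software model of one 128-bit beat.  It is a sketch only: it assumes bit i of tKeep corresponds to byte lane i, with lane 0 being tData(7 downto 0), and kept_bytes is a hypothetical helper name.

```python
def kept_bytes(tdata, tkeep):
    """Return the bytes of a 16-byte tData beat whose tKeep bit is set.

    tdata: 16 bytes, index 0 = byte lane 0 (assumed to be tData(7 downto 0)).
    tkeep: 16-bit integer, bit i marks byte lane i as valid.
    """
    if len(tdata) != 16:
        raise ValueError("expected a 128-bit (16-byte) beat")
    return bytes(b for i, b in enumerate(tdata) if (tkeep >> i) & 1)
```

For example, a final beat carrying only four valid bytes would have tKeep = 0x000F, and kept_bytes returns just the first four bytes of the beat.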

Thoughts on Controlling Hardware Trigger From Ryan (04/11/18)

  • Should not mock-up an axi-write every time we receive the right event-code
  • Currently the SWControlEn/SWControlValue are direct axi writes to front-end board
  • Will need to change front-end board firmware
  • TxControl in KCU1500 has 8-bit field and valid that needs to be set to communicate to front end board.  Will map this to CC in front-end board.
  • TxControl goes through Pgp2bAxi module.
  • Can control these registers both with software (for debugging) or hardware.

Tracking Down Timing Fiber Input

  • pgplanewrapper.vhd has this code:
    evrRxP(1) <= qsfp1RxP(3);
    evrRxN(1) <= qsfp1RxN(3);
    evrRxP(0) <= qsfp1RxP(2);
    evrRxN(0) <= qsfp1RxN(2);

  • not sure why 2 EVRs (maybe lcls1 and lcls2?)
  • other 6 lanes are all hooked up to PGP
  • pgplanewrapper also hooks up the 6 dmaObMasters/dmaIbMasters etc to the pgp lanes (last two are unused, because they are used by evr)
  • timetoolkcu1500.vhd hooks dmaIbMasters(0) to the AxiStreamTap, so our camera data is on qsfp0[0], which is the first of the 6 pgp lanes
  • timing fibers are on the last two fibers of qsfp1

Setting Up TimingCore

  • under "commands" tab need to ConfigLclsTimingV1.  this controls a multiplexer that routes the evr clock/data to the fpga, either from lcls1 or lcls2
  • under "variables" tab GtLoopback 0 is normal mode, 2 is internal loopback which includes EVG simulator sending some opcodes, 4 is a later loopback as described on page 85 of Xilinx ug576 guide https://www.xilinx.com/support/documentation/user_guides/ug576-ultrascale-gth-transceivers.pdf
  • mmcm (mixed-mode clock manager) is like a souped-up old dcm (digital clock manager), able to generate many more frequencies, with jitter of ~25ps (comparable to a standard external oscillator)
  • qsfp1 input 2 (counting from 0) can only accept lcls1 timing (static) and input 3 can only accept lcls2 timing.  Possibly it's switched, but I don't think so.
  • in "variables" tab TimeToolDev->HW->TimingCore->EvrCore, sofCount/eofCount/FidCount should all increment at 360Hz if things are working.
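
The 360Hz check on sofCount/eofCount/FidCount can be done numerically by reading a counter twice and dividing by the elapsed time.  The sketch below assumes free-running 32-bit counters that wrap at most once between reads; how the values are read (GUI or pyrogue) is left out, and counter_rate is a hypothetical helper name.

```python
def counter_rate(first, second, seconds, width=32):
    """Rate in Hz from two reads of a free-running counter, allowing one wrap."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = (second - first) % (1 << width)
    return delta / seconds
```

With reads of 100 and 460 taken one second apart the function returns 360.0, which is the value expected when LCLS-I timing is locked.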

Using Xilinx Integrated Logic Analyzer (ILA, aka "ChipScope")

  • use the vivado core-generator to generate an ila_0 core.  Matt says use the generated .xci file, not the .dcp file
  • add the .xci file to appropriate ruckus.tcl with something like this:
    • loadIpCore -path "$::DIR_PATH/coregen/ila_0.xci"
  • instantiate an ila_0 component in vhdl
  • build and program the fpga
  • open vivado (currently on rdsrv223)
  • connect over jtag
  • find .ltx file generated by build with ila.  this is a list of signals that are exported to the ila.
  • in vivado click on the fpga (e.g. xcku115_0) in top left
  • in the "hardware device properties" enter the .ltx file in the "probes file" field
  • a list of ila's should appear along with waveforms in the display

 

  • also needed: under "commands" tab need to ConfigLclsTimingV1.  this controls a multiplexer that routes the evr clock/data to the fpga, either from lcls1 or lcls2

See "ILA core generation visual step through" for a walk-through of how to set this up.

Simulation tools

https://www.xilinx.com/support/documentation/sw_manuals/xilinx2018_1/ug937-vivado-design-suite-simulation-tutorial.pdf

(as of 06/21/2018, simulation tools use C library that's only present on RHEL 7 machines.)

 

SURF website

https://slaclab.github.io/surf-doc/surf_1_documentation/html (this website is no longer maintained as of 6/21/2018; use the github link below instead)

https://github.com/slaclab/surf

 

pyrogue python scripting example

how to use pyrogue from python shell


debugging timing in firmware

look for .rpt files in the 

build/TimeToolKCU1500_project.runs/impl_1/*.rpt

and in 

build/TimeToolKCU1500_project.runs/synth_1/

grep -A 3 VIOLATED TimeToolKcu1500_timing_summary_routed.rpt | grep "Destination:"
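
The same filter can be written in Python when grep is not convenient.  This is a sketch of the grep pipeline above and not a parser for the full Vivado report format; violated_destinations is a hypothetical helper name, and unlike grep it strips leading whitespace from the returned lines.

```python
def violated_destinations(report_text, context=3):
    """Mimic: grep -A 3 VIOLATED report | grep "Destination:"."""
    lines = report_text.splitlines()
    selected = set()
    for i, line in enumerate(lines):
        if "VIOLATED" in line:
            # Keep the matching line plus the next `context` lines, like grep -A.
            selected.update(range(i, min(i + context + 1, len(lines))))
    return [lines[i].strip() for i in sorted(selected) if "Destination:" in lines[i]]


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as report:
        for dest in violated_destinations(report.read()):
            print(dest)
```

Run it with the path to TimeToolKcu1500_timing_summary_routed.rpt as the only argument.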

 

presentation / demo

https://docs.google.com/presentation/d/1RMB0pxKQXMMOqtIbARzkCOirL6SD_WuWWGkIX7bnh2M/edit#slide=id.g42d279e452_0_143

visual notes

https://docs.google.com/presentation/d/1QaUZmsM9fOA6M3zlxPCV7yDk4sWt6NioPboEcoP8mi4/edit#slide=id.g433a0ab090_0_5

 

Matt Weaver's LCLS2 time stamping

https://github.com/slaclab/l2si/blob/master/firmware/common/base/rtl/EventHeaderCache.vhd

 

AxiStream Batcher Protocol Version 1

To do list

  • prescaling data by x amount over register
  • simulation of modules inside TimeToolCore.vhd
  • peak-finding (final parameters)
  • programming weights by axi-stream
  • emulate LCLS2 timing system in firmware
  • divide by the delayed-minus-pedestal (use LUT?)
  • virtual evg. (should only require gui tweaks. check with ILA)
  • how to handle the git large files
  • get rid of gui
    • gui fully removed,
    • still need to make smart and stream line
  • programmable event-code & trigger delay (axi bus)
  • emulate epix model (send timestamp to front-end board. time stamp should be at the beginning of image packet.).  later:  maybe this isn't necessary because in future we would support camlink-over-fiber cameras with no front-end board?
  • save hdf5
  • full/deca mode (only 8 bit for deca or full)
  • get test stand in 901 working
  • start with Matt matched filter algorithm or Abdullah algorithm
  • feedback results to Joe Frisch as udp packets or accelerator-style-pgp
  • is 8-bits OK for deca? (answer: yes, for interferometry mode)
  • understand how system behaves at high rates (do we drop frames? timestamps correct?)
    • reuse matt's firmware for timestamping
    • add in LCLS2 timing (triggering)
    • send full signal to Matt
    • switch from python to C
    • think about running the camera all the time (to send feedback data continuously)
  • difference kc705  revisions 1.1 vs 1.2.:  no significant difference according to https://www.xilinx.com/support/answers/59751.html.  Either an FMC problem, or subtle timing issue, or need to specify board rev somehow when synthesizing?
  • interface to DRP?
  • new front-end boards
  • touch base with Ryan on code structure issues
  • prepare for running in LCLS-I?  If yes:
    • spy on the timestamp multicasts in software (maybe not necessary)
    • take real pictures with lens from Ryan
    • trigger delay and event-code programmable via AXI
    • eliminate gui and save hdf5
    • put all setup commands in python script
    • do we need feedback for laser? (no)
    • run at 120Hz  overnight and validate
  • resolve vhdl axi-lite offset constants

 

Algorithm Conversation With Giacomo (March 6, 2019)

  • Giacomo is doing spatial, Ryan is spectral (roughly speaking)
    • both are interferometric, so the background we had in LCLS1 is mostly gone, but still have some background from stray light, for example
  • Formula that should work for both:  (S-B)/(S(delayed)-B(delayed)). (note: division is similar to normalized subtraction:  (S-B)/B = (S/B)-1)
  • B is different for S and S(delayed) (background at time "delayed" is taken)
  • Giacomo thinks we won't need the "delayed" subtraction/division for 80% of the experiments
  • Giacomo will ensure that we run in a regime where (S(delayed)-B(delayed)) is not close to zero
  • sxrx34917 (being analyzed by Giacomo and Stefan Droste).  this data is "zoomed out" in the time dimension, but will be more zoomed-in for LCLS-II.
  • S(delayed) comes from moving the laser out of the time window, and is done "offline" according to Giacomo every few hours
  • Giacomo wants a fit to the minimum of dI/dt (derivative of I-B) to a parabola
  • Giacomo thinks that the feedback will be slower (millisecond level).  Contradicts what we learned before where Joe Frisch was going to do fpga stuff with the laser-locking system
  • algorithm should be same for spatial/spectral
  • Giacomo wants a way to look at S-Delay when the timing edge is out of the window.  Tells them if it's on the "right" or the "left".
  • Questions for Giacomo:
    • the edge looks big in sxrx34917, do we definitely need to divide? Answer: no, as long as we can set an ROI.  i.e. don't need
    • does the "delayed" contain background from stray light, for example?  Answer: "delayed" is the pump laser delayed off the end of the camera image
    • is B just pedestal, or does B include laser-pump?  If it's laser-pump, do we need to use IIR to compute B?  Answer:  B is remaining background (same as LCLS-I) not eliminated by interferometric approach.
    • do we use the same B in the numerator/denominator?  Answer: no
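
The formula and the edge-finding idea from this conversation can be sketched in plain Python.  This is an illustration under assumptions, not the agreed algorithm: traces are plain lists of per-pixel values, the derivative is a simple neighbouring-pixel difference, and the "parabola fit" is replaced by a three-point parabola through the lowest derivative sample and its two neighbours.  All function names are hypothetical.

```python
def normalized_trace(s, b, s_delayed, b_delayed, eps=1e-9):
    """Element-wise (S-B)/(S(delayed)-B(delayed)) for equal-length sequences."""
    out = []
    for si, bi, sdi, bdi in zip(s, b, s_delayed, b_delayed):
        denom = sdi - bdi
        if abs(denom) < eps:
            raise ZeroDivisionError("S(delayed)-B(delayed) is too close to zero")
        out.append((si - bi) / denom)
    return out


def derivative(trace):
    """Finite difference between neighbouring pixels."""
    return [b - a for a, b in zip(trace, trace[1:])]


def parabolic_minimum(values):
    """Sub-sample position of the minimum, from a parabola through the
    lowest sample and its two neighbours (falls back to the sample index)."""
    i = min(range(len(values)), key=values.__getitem__)
    if i == 0 or i == len(values) - 1:
        return float(i)
    left, mid, right = values[i - 1], values[i], values[i + 1]
    curvature = left - 2 * mid + right
    if curvature == 0:
        return float(i)
    return i + 0.5 * (left - right) / curvature
```

A typical use would be parabolic_minimum(derivative(normalized_trace(S, B, Sd, Bd))), which gives the edge position in pixel units; converting pixels to time needs a calibration that is not covered here.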

Conversation with Ben Reese (2/13/20)

feb fpga version on feb. 13 2020:
ClinkFebPgp2b_1ch built Tue 16 Jul 2019 10:39:15 AM PDT by ruckman on rdsrv222

_TimeToolKcu1500Root.py has register settings
https://github.com/slaclab/lcls2-timetool/blob/master/software/config/config_20200207_152212.yml

ConfigureXpmMiniSim
TimingRx->TimingPhyMonitor->UseMiniTpg (selects minitpg)
(check clock freq's under TimingPhyMonitor)
switching to real timing needs an xpm txreset: see this as RxClkFreq not being the same as TxClkFreq
pipelinedepthfids=90
config_l0_select_ratesel=0x7
7 = 1Hz
5 or 4 is 100Hz
config_l0_select_enabled=1 starts triggers
TriggerEventManager->TriggerEventBuffer[0]
- has l0 counts
- has xpmpause/xpmoverflow (too much data). xpmoverflow is a sign backpressure not working well enough: reduce pausethreshold. xpmoverflow is latched version of fifooverflow. permanently latch deadtime. needs a clearreadout to reset.
- missing clearreadout going to front end board
- pausethreshold sets when to send deadtime to XPM
XpmMessageCount increments @1MHz
clinkfeb[0]->clinktop->trigcontrol[0] has a trigger rate
TriggerEventManager->TriggerEventBuffer[0]->masterenable enables triggers going to frontend
TriggerEventManager->TriggerEventBuffer[0]->eventbufferenable enables the fifo that receives the timing frames (and sends them to the batcher eb)
mostly masterenable/eventbufferenable should go together?
eventbufferenable first, masterenable second
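
The enable ordering noted above can be captured in a small helper so scripts always apply it the same way.  This is a sketch: write is a caller-supplied function (for example a thin wrapper around the pyrogue register set call, which is not shown here), the register paths are copied from these notes, and enable_triggers is a hypothetical name.

```python
EVENT_BUFFER = "TriggerEventManager->TriggerEventBuffer[0]"


def enable_triggers(write):
    """Apply the enables in the order noted above using write(path, value)."""
    # First: the fifo that receives the timing frames.
    write(EVENT_BUFFER + "->eventbufferenable", True)
    # Second: triggers going to the front end.
    write(EVENT_BUFFER + "->masterenable", True)


if __name__ == "__main__":
    # Dry run: print what would be written instead of touching hardware.
    enable_triggers(lambda path, value: print(path, "=", value))
```

Passing a print-style function, as in the dry run, is a cheap way to review the sequence before pointing it at real registers.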

xpmmini is only partition(rog) 0.

to switch to realxpm:
- Kcu1500Hsio->TimingRx->ConfigLclsTimingV2
- toggle UseMiniTpg to false

watch kcu1500hsio->timingrx->timingframerx->RxDown (resets to 0 on ConfigLclsTimingV2, latches when link drops)

todo:
- clear readout (workaround TimeToolKcu1500->Application->AppLane[0]->EventBuilder->Blowoff)
- trigger 10us feedback
- edge calculation

 
