...

The XPM link went bad again, and recovered with TxPhyReset.

Questions for Larry/Ben:

...

Ben pointed out that in the failure case these L0Count/L1AcceptCount counters increment at the right rate but TxOpCodeCount doesn't, so not all L1s are being translated into TxOpCodes (!), which points to a problem in the KCU.  This appears to contradict the camera-powercycle approach above, but perhaps that effect is caused by the exposure time getting set smaller, so that it can handle messed-up delays?  (see L0Delay solution below):

Image Added
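The counter comparison Ben describes can be sketched roughly as below. This is only an illustration, not the actual monitoring code: the function name and the snapshot numbers are made up, it assumes that in a healthy system TxOpCodeCount advances one-for-one with L1AcceptCount, and it ignores counter wraparound.

```python
def lost_opcode_fraction(before, after):
    """Fraction of L1 accepts in the interval that produced no TxOpCode.

    Assumes (hypothetically) a 1:1 relation between L1AcceptCount and
    TxOpCodeCount when the link is healthy, and no counter wraparound.
    """
    d_l1 = after["L1AcceptCount"] - before["L1AcceptCount"]
    d_tx = after["TxOpCodeCount"] - before["TxOpCodeCount"]
    if d_l1 <= 0:
        raise ValueError("no L1 accepts in the interval")
    return 1.0 - d_tx / d_l1


# Made-up snapshots for illustration only (not real readings).
before = {"L1AcceptCount": 1000, "TxOpCodeCount": 1000}
after = {"L1AcceptCount": 2000, "TxOpCodeCount": 1750}

print(lost_opcode_fraction(before, after))
```

A result near zero would mean every L1 is turning into a TxOpCode; a clearly positive value would match the failure Ben describes.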

Questions for Larry/Ben:

  • does clearreadout trigger blowoff and fiforeset? (and reset the latched xpmoverflow)
  • why would TxPhyReset help the xpm link?
  • is TxOpCodeCount the right thing to watch?
  • why doesn't send string work for the opal?  Answer from Larry:  need to prepend an "@", e.g. "@SN?" seems to work.

Try moving back to old timing values.  Current settings:

Code Block
(ps-3.1.16) tmo-daq:lcls2$ pvget DAQ:NEH:XPM:0:XTPG:CuDelay
DAQ:NEH:XPM:0:XTPG:CuDelay 2020-10-21 09:33:26.010  134682 
(ps-3.1.16) tmo-daq:lcls2$ pvget DAQ:NEH:XPM:0:PART:4:L0Delay
DAQ:NEH:XPM:0:PART:4:L0Delay 2020-10-21 09:33:26.334  10 

Matt writes that the pre-Oct.-15 values were:

Code Block
These were the values prior to 10/15
pvput DAQ:NEH:XPM:0:PART:3:L0Delay 90 (readout group 3)
pvput DAQ:NEH:XPM:0:XTPG:CuDelay 160000
Changes after that are in the tmoc00118 logbook.

Doing this appeared to fix the problem!:

Code Block
(ps-3.1.16) tmo-daq:lcls2$ pvput DAQ:NEH:XPM:0:PART:4:L0Delay 90
Old : 2020-10-21 09:33:26.334  10 
New : 2020-10-21 12:48:30.590  90 
(ps-3.1.16) tmo-daq:lcls2$ pvput DAQ:NEH:XPM:0:XTPG:CuDelay 160000
Old : 2020-10-21 09:33:26.010  134682 
New : 2020-10-21 12:48:58.977  160000 

It turns out the CuTiming didn't matter; only the L0Delay change fixed it.  From the opal log file:

Code Block
I guess this is output from the logfile when it fails:  triggerDelay 17810
And when it succeeds: triggerDelay 1810 (after changing L0Delay to 90)

This made it feel like we're running out of bits in firmware for triggerDelay, but (a) rogue reads back the register correctly and (b) Matt says he thinks we have 32 bits.
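A quick arithmetic check on the two logged triggerDelay values, as a sketch only. It assumes the failing value (17810) corresponds to the earlier L0Delay of 10 and the working value (1810) to L0Delay 90, and with only two data points the linear relation is an inference, not a confirmed property of the firmware.

```python
# Values taken from the notes above; the pairing of 17810 with
# L0Delay = 10 is an assumption based on the setting before the change.
fail = {"L0Delay": 10, "triggerDelay": 17810}
ok = {"L0Delay": 90, "triggerDelay": 1810}


def ticks_per_l0delay_unit(a, b):
    """triggerDelay counts removed per unit increase of L0Delay (two-point slope)."""
    return (a["triggerDelay"] - b["triggerDelay"]) / (b["L0Delay"] - a["L0Delay"])


slope = ticks_per_l0delay_unit(fail, ok)        # (17810 - 1810) / (90 - 10)
bits_fail = fail["triggerDelay"].bit_length()   # bits needed for 17810
bits_ok = ok["triggerDelay"].bit_length()       # bits needed for 1810

print(slope, bits_fail, bits_ok)
```

The slope works out to 200 triggerDelay counts per L0Delay unit. The failing value needs 15 bits and the working value 11, so both fit easily in a 32-bit register; whether some narrower intermediate field exists is not something these numbers can show.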

...


Not Understood Failure Modes

...