Page History
...
action | result | remedy | result after remedy |
---|---|---|---|
Remove XPM10 fiber timing in the back while DAQ running | *** XpmDetector: timing link ID is ffffffff = 4294967295 | TxlinkReset of cmp015 in XPM11 | DAQ recovers |
Repeat XPM10 fiber timing removal | DAQ cannot disable | --- | DAQ recovers by itself at restart |
Repeat XPM10 fiber timing removal | --- | --- | no issue |
Repeat XPM10 fiber timing removal | DAQ cannot disable | --- | DAQ recovers by itself at restart |
Remove XPM10 fiber timing in the back while DAQ stopped | --- | --- | DAQ starts with no issue |
Repeat XPM10 fiber timing removal while DAQ stopped | --- | --- | DAQ starts with no issue |
Remove transceiver from XPM10 in the back (DAQ stopped) | --- | --- | DAQ starts with no issue |
Remove transceiver from XPM10 in the back (DAQ started) | --- | --- | DAQ starts with no issue |
--- | timing 1 shuts down by itself | TxlinkReset on XPM10 for XPM11 | DAQ recovers |
Remove fiber on XPM10 to XPM11 | --- | --- | DAQ starts with no issue |
Remove transceiver on XPM10 to XPM11 | --- | --- | DAQ starts with no issue |
Remove fiber on XPM11 AMC0 port 0 | --- | --- | DAQ starts with no issue |
Remove transceiver on XPM11 AMC 0 port0 | --- | --- | DAQ starts with no issue |
--- | opal disappears from the list of detectors | restart DAQ | DAQ starts with no issue |
power cycle XPM10 via switch, only AMC0 | XPM11 loses timing node | Restart pyxpm 10 and 11; restart pyxpm 11 | DAQ restarts but opal shuts down |
--- | opal still shut down | devGui xpmmini timing v2 | to no avail |
--- | --- | Stop pyxpm 10 and 11 | DAQ starts with no issue |
Conclusion
...
It appears that yanking the timing fiber can cause disturbances in the system, but they are not repeatable 100% of the time.
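As a rough tally of the four fiber-pull trials made while the DAQ was running (the first four rows of the table), the sketch below counts how often a disturbance was seen. The outcome labels are shorthand introduced here for illustration, not DAQ terminology, and four trials are far too few to give a meaningful rate.

```python
from collections import Counter

# Outcomes of the four "remove XPM10 timing fiber while DAQ running" trials,
# transcribed from the table. The label strings are shorthand chosen here.
trials = [
    "link ID ffffffff",  # first pull: recovered after TxlinkReset
    "cannot disable",    # recovered by itself at restart
    "no issue",
    "cannot disable",    # recovered by itself at restart
]

counts = Counter(trials)
failures = sum(n for outcome, n in counts.items() if outcome != "no issue")
failure_rate = failures / len(trials)

for outcome, n in counts.items():
    print(f"{outcome}: {n}")
print(f"{failures} of {len(trials)} trials showed a disturbance")
```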
XPM power spikes can put the DAQ into a state similar to the XPM glitch, but only if the pyxpm processes are running. This test is to be repeated.
Upgrading the XPM firmware from 3.5.4 to 3.6.0 seems to have mitigated all the issues. The bucket issue becomes more prominent, probably because the other issues are no longer happening. The bucket issue appears when power cycling XPM11.
Brainstorming Session
Nov. 16, 23 with mona, dan, weaver, caf, claus, melchior, cpo
proposal:
- move ric/mona/christos to xpm10 (for the future)
- give riccardo the whole system for the day and he messes with xpm10
- add startupMode=1 kwarg to opal
new xpm firmware (leaving xpm10 alone, no xpmmini->lcls2 hack): riccardo can't reproduce the errors, except for bucket skipping (txlinkreset fixed it for matt, but not riccardo and ric)
old xpm firmware (also messing with xpm10 with xpmmini->lcls2): riccardo could reproduce xpm link glitch and txlinkreset (once) and (likely) xpmmini issue
theories:
- maybe ConfigLclsTimingV2 isn't reliable (should perhaps poll on something like rxid!=0xffffffff)
- either new xpm firmware makes things better
- or we need to mess with xpm10 to reproduce problems
- or we're unlucky and can't reproduce (or we're not doing the right things to reproduce)
- might need a minimum length of time to tickle the issues (matt says try 30 minutes to 1 hour)
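
The first theory suggests polling until the received link ID is no longer 0xffffffff (the 4294967295 seen in the XpmDetector log line) rather than trusting the configuration step on its own. Below is a minimal sketch of such a poll. It assumes a hypothetical `read_rxid` callable that returns the current link ID as an integer; the function name, timeout, and poll interval are illustrative only, and no real register access or DAQ API is shown.

```python
import time
from typing import Callable

# Link ID reported when the timing link is not up (0xffffffff == 4294967295).
INVALID_LINK_ID = 0xFFFFFFFF


def wait_for_valid_rxid(
    read_rxid: Callable[[], int],      # hypothetical reader, supplied by the caller
    timeout_s: float = 5.0,            # assumed value, not from the notes
    poll_interval_s: float = 0.1,      # assumed value, not from the notes
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll read_rxid() until it returns something other than 0xffffffff.

    Returns the first valid link ID, or raises TimeoutError if the link ID
    is still invalid once timeout_s has elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        rxid = read_rxid()
        if rxid != INVALID_LINK_ID:
            return rxid
        if clock() >= deadline:
            raise TimeoutError(
                f"link ID still {INVALID_LINK_ID:#x} after {timeout_s} s"
            )
        sleep(poll_interval_s)


# Demo with a simulated reader: invalid twice, then a made-up valid ID.
_simulated = iter([0xFFFFFFFF, 0xFFFFFFFF, 0x1234])
print(hex(wait_for_valid_rxid(lambda: next(_simulated), poll_interval_s=0.01)))
```

In a real detector the caller would pass a function that reads the actual link ID, and would decide what to do on `TimeoutError`, for example retrying the timing configuration or issuing a TxlinkReset.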
matt has an idea for bucket-jumps. could direct julian.