Status

Slack conversation with Ric on Sep 15 2023 about the status of bypass events.

Hi Ric, you’ve probably told me this before, but after this morning’s conversation you had with Matt about the common-ROG I think I need to be reminded:  I believe you added in this “bypass event” idea where if an event didn’t have the common-ROG (e.g. due to deadtime, which isn’t synchronized across readout groups) we would survive.  But when Matt ran last night without a common-ROG we didn’t survive.  Does that mean there is a limit on the fraction of bypass events we can have, and/or are non-common-ROG events marked damaged?

Hi Chris.  If I remember correctly, bypass events were causing Mona and the offline trouble, so we agreed to get rid of them.  Since then, events with an env in which the common RoG bit isn’t set are discarded very early on and counted in the nNoComRoG errors that you can see in the Error counts plot on the grafana page (below the Damage plots).  I don’t think I can see a reason why we would have died on a run without a common RoG.  I would have expected that we would have run fine, but no L1Accepts would have come through.
I think I was misremembering the situation in the meeting this morning.  What I was trying to say was that if we were to send events through the system without the common RoG having fired, the TEB could potentially build them out of time order.  Out of time order events cause aborts in various places in the system.

Matt writes about the rix run that didn't have a common ROG: The run proceeded, but I got nervous seeing all the bld events damaged, so I went ahead and made the common readout group, which only included timing.

Description

  • Bypass events occur if the highest-rate ("primary") readout group is missing in the env
    • They're called "bypass events" because they bypass the TEB
    • These events can be L1Accepts and Transitions
    • These events are caused by the primary readout group experiencing deadtime
  • They make it difficult for Ric's event builder because the highest-rate readout group is used to keep things in time order
    • During periods of the primary readout group experiencing high deadtime, the event builder might be empty.  If it were to receive contributions from the secondary readout group during this time, they could be event built and dismissed before the next primary readout group contribution arrives in the event builder.  Since there is no reference for the event builder to know that an older event is coming, these events get out of time order.  This reference is provided by requiring all events to contain contributions from the primary (fastest) readout group so that time order is maintained.
  • We tried to deadtime-couple, but the "TTL" triggers to the detectors are delivered at different times, so the coupled deadtime doesn't really work (full signals from DRP nodes aren't event built; they are treated as being instantaneous).  See Matt's timing diagram below.
  • Ric doesn't send bypass events to the TEB; they just flow through and are recorded to disk.  These are not a problem, to first order.
    • Since L1 events learn of an available monitoring buffer in the TEB, bypass L1 events are never sent to an MEB
    • Transitions learn of an available monitoring buffer in the DRP and are always "broadcast" to all MEBs
  • Transitions without the primary readout group (bypass events) can cause the MEB to crash
    • Matt can maybe eliminate them in firmware
      • He has an idea of running the highest-rate readout group with a low L0Delay (typically it's a high L0Delay for high-rate groups) and then using that to inhibit later "TTL triggers" to the detectors
      • This requires a change to the XPM logic to cache/couple the earlier group 0 inhibit to the later groups, and requires that the timing segment level have lots of buffering (easy)
    • Ric is thinking about whether or not they can be eliminated in software (e.g. the PGPReader thread with an early release).  Might require that the buffer release code is re-entrant, since other threads may also be doing releases
      • This has been done and was released.  Bypass events don't exist anymore and events that triggered without the common readout group are counted and released early in PGPReader.  The counts of such "error" events can be found on the L2SI Grafana page below the Damage plots.  Such "error" events won't appear in the offline/live-mode.
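
The time-ordering problem and the early-discard fix described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Ric's event builder or the PGPReader code: the names `build_events`, `discard_without_primary`, `PRIMARY` and `SECONDARY` are hypothetical, each event is reduced to a timestamp plus the set of readout groups expected to contribute, and the toy builder dispatches the oldest pending event as soon as it is complete. The counter mimics the nNoComRoG error count mentioned above.

```python
# Toy model (hypothetical names throughout): one contributor per readout group.
PRIMARY = "primary"      # highest-rate (common) readout group
SECONDARY = "secondary"  # a slower readout group


def build_events(arrivals, expected):
    """Return timestamps in the order the toy event builder dispatches them.

    arrivals: list of (timestamp, source) in arrival order.
    expected: dict mapping timestamp -> set of sources that contribute.
    """
    pending = {}  # timestamp -> set of sources received so far
    built = []
    for ts, src in arrivals:
        pending.setdefault(ts, set()).add(src)
        # Dispatch from the oldest pending event for as long as it is complete.
        while pending:
            oldest = min(pending)
            if pending[oldest] != expected[oldest]:
                break
            built.append(oldest)
            del pending[oldest]
    return built


def discard_without_primary(arrivals, expected):
    """Drop contributions to events lacking the primary group and count those events."""
    kept = []
    discarded = set()
    for ts, src in arrivals:
        if PRIMARY in expected[ts]:
            kept.append((ts, src))
        else:
            discarded.add(ts)
    return kept, len(discarded)


# Event 101 fired without the primary group (a would-be bypass event).
expected = {
    100: {PRIMARY},
    101: {SECONDARY},
    102: {PRIMARY, SECONDARY},
}
# The primary contribution for 100 is late, so 101 reaches an empty builder first.
arrivals = [(101, SECONDARY), (100, PRIMARY), (102, SECONDARY), (102, PRIMARY)]

with_bypass = build_events(arrivals, expected)
kept, n_no_com_rog = discard_without_primary(arrivals, expected)
without_bypass = build_events(kept, expected)

print(with_bypass)      # [101, 100, 102]
print(without_bypass)   # [100, 102]
print(n_no_com_rog)     # 1
```

In the first case the builder is empty when the contribution for 101 arrives, so it has no reference telling it that the older event 100 is still on its way. Once every event must contain the primary group, each event waits for a contribution from a single time-ordered stream, and in this toy model that is enough to keep the output in time order.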