This page describes the design of CAL crystal recon. In particular, it is intended to contain the definition of the xtal status word.

It includes some discussion of mitigation of the failure of the HE channels of a single GCFE. There are 3072 GCFEs in CAL, two for each of the 1536 xtals.

Symptoms of the failure

The problem occurred in the run starting at MET = 301753824, which is MJD 55402.52108796, or 2010 Jul 25 at 12:30:22 UTC, or 2010 day 206 at 12:30:22 UTC.
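The equivalence of the time stamps above can be checked with a short sketch. It assumes the MET epoch is 2001-01-01 00:00:00 UTC and that MET counts leap seconds while UTC does not; the function names and the leap-second table are illustrative, not part of any LAT software.

```python
from datetime import datetime, timedelta

# Assumption: MET counts elapsed seconds since 2001-01-01 00:00:00 UTC.
MET_EPOCH = datetime(2001, 1, 1)
MJD_OF_MET_EPOCH = 51910.0  # MJD of 2001-01-01

# UTC midnights immediately following each leap second inserted after the epoch.
LEAP_SECOND_BOUNDARIES = [
    datetime(2006, 1, 1),
    datetime(2009, 1, 1),
    datetime(2012, 7, 1),
    datetime(2015, 7, 1),
    datetime(2017, 1, 1),
]

def met_to_utc(met):
    """Convert MET seconds to a UTC datetime (approximate within a few
    seconds of a leap-second boundary)."""
    uncorrected = MET_EPOCH + timedelta(seconds=met)
    n_leap = sum(1 for t in LEAP_SECOND_BOUNDARIES if uncorrected > t)
    return uncorrected - timedelta(seconds=n_leap)

def utc_to_mjd(utc):
    """Convert a UTC datetime to Modified Julian Date."""
    return MJD_OF_MET_EPOCH + (utc - MET_EPOCH).total_seconds() / 86400.0

utc = met_to_utc(301753824)
print(utc, utc.timetuple().tm_yday, round(utc_to_mjd(utc), 8))
```

With the run start MET = 301753824 this gives 2010 Jul 25 12:30:22 UTC, day 206, and MJD 55402.52108796, matching the values quoted above.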

On 27 Jul 2010 (Day 208), at 1:55 PM EDT, Anders Borgland wrote:

Starting with run 301753824 we have two problems:

1/
We do not see any signal in the high energy diode in tower 4, layer X1, column 4, + side. From Eric Siskind: "Whether the failure is within the diode itself or in the GCFE electronics chain (HE preamp or slow shaper) is currently unknown". You can read the whole thread here:

https://www-glast.stanford.edu/protected/mail/datamon/4835.html

2/ While problem 1/ affects all (high energy) events in that channel, GCR events (4-range and zero-suppressed events) tickle FSW bug 1156:

https://jira.slac.stanford.edu/browse/FSW-1156

This means that about 10 events per run will fail in the decompression. Because of the way the Halfpipe works we lose the complete datagram for each of these events. Since a datagram contains about 110 events we are currently losing about 1100 events per run. This corresponds to about 2.5 seconds of data for each 90 minute run.

The FSW group have a fix for bug 1156 and will upload a new build asap.

Note that currently there is no failure mode in CalRecon so events from this channel is not treated in any special way. NRL is working on this.

It should also be noted that the problem was caught immediately by two separate parts of the Data monitoring. The automatic alarms caught both the missing datagrams and the missing signal from the diode. These runs are marked as 'GOOD' by the DQM shifter, but with a comment attached to them. Obviously we will have to live with the missing diode signal from now on.

Some of you will not have failed to notice the irony that it's GCR events tickling FSW bug 1156 (hint: SSC-258) (smile)

anders
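The data-loss estimate in the message above is simple arithmetic and can be reproduced with a short sketch. The implied event rate and the lost fraction are derived here from the quoted numbers and are not stated in the message; the calculation also assumes that each failed event falls in a different datagram.

```python
# Numbers quoted in the message (all approximate).
failed_events_per_run = 10     # events that fail decompression
events_per_datagram = 110      # events lost together with each failed event
lost_seconds_per_run = 2.5     # quoted equivalent live time lost
run_length_s = 90 * 60         # one 90-minute run

# Derived quantities (not in the original message).
lost_events = failed_events_per_run * events_per_datagram
implied_rate_hz = lost_events / lost_seconds_per_run
lost_fraction = lost_seconds_per_run / run_length_s

print(f"lost events per run: {lost_events}")
print(f"implied event rate:  {implied_rate_hz:.0f} Hz")
print(f"lost fraction:       {lost_fraction:.2%}")
```

Under these assumptions the loss is about 1100 events per run, consistent with the message, and amounts to roughly 0.05% of each run.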

What has failed?

As soon as we understand exactly what has failed in this GCFE, I'll type something here.

Consequences of this failure

On 27 Jul 2010 (Day 208), at 2:15 PM EDT, J. Eric Grove wrote:

Additional clarification:

We do not see any signal in the high energy diode in tower 4, layer X1, column 4, + side. From Eric Siskind: "Whether
[...]
Note that currently there is no failure mode in CalRecon so events from this channel is not treated in any special way. NRL is working on this.

What this means is that (until xtal recon is fixed):

  1. Any photon that MISSES this one crystal is not affected. It is correctly and properly reconstructed.
  2. Any photon that deposits LESS THAN about 1 GeV in this one crystal is not affected. It is correctly and properly reconstructed.
  3. Any photon that deposits MORE THAN about 1 GeV in this one crystal has an incorrect energy and position measurement in this one crystal, and therefore has an incorrect reconstructed incident energy and direction. The level of error in reconstructed incident energy and direction is surely energy-dependent, and I don't have an estimate of the magnitude yet.

In the above sentences, "photon" means "any event that is not read out by Trigger Engine 4, i.e. any event that is not read out in 4-range, zero-suppressed mode". I used the word photon to focus the discussion.

The combination of cases (1) and (2) covers the overwhelming majority of photons in the LAT dataset, so most events are perfectly fine, but clearly we need to implement a fix for this particular failure in the code that reconstructs crystal energy and position.

Eric
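The three cases in the message above can be sketched as a small decision function. The function name, its arguments, and the return strings are hypothetical, and the 1 GeV threshold is only the approximate value quoted in the message, not a calibrated constant.

```python
# Approximate threshold from the message ("about 1 GeV"); an assumption, not a calibration.
HE_THRESHOLD_GEV = 1.0

def failed_xtal_impact(hits_failed_xtal, energy_in_xtal_gev, four_range_readout=False):
    """Classify an event according to cases (1)-(3) of the message.

    Events read out in 4-range, zero-suppressed mode are outside the
    definition of "photon" used there, so they are reported separately.
    """
    if four_range_readout:
        return "not covered by cases (1)-(3)"
    if not hits_failed_xtal:
        return "unaffected"   # case 1: event misses the crystal
    if energy_in_xtal_gev < HE_THRESHOLD_GEV:
        return "unaffected"   # case 2: deposit below about 1 GeV
    return "incorrect energy and position in this crystal"  # case 3

print(failed_xtal_impact(False, 0.0))
print(failed_xtal_impact(True, 0.3))
print(failed_xtal_impact(True, 5.0))
```

Only the last call falls into case (3), the one that the recon changes below are meant to address.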

Once CAL xtal recon has been modified with the changes outlined below, the performance of this xtal will be essentially nominal, and the performance of the LAT will be unaffected. For more details, see CAL July 2010 channel failure.

Changes necessary to recon

...

I've divided the 32 bits into 16 for recon status (and used 10), 10 for hardware status (and used 8 so far), and 5+1 for config status (and used 4). Perhaps it'd make more sense to reserve more bits for recon status. Note that I've included two bits for Bill's mods to the longitudinal position calculation to indicate whether he's corrected for direct light in the near diode or corrected for an ambiguous asymmetry. Those can be set in the initial CalXtalRecAlg pass and used as desired later.

...

Bit      Meaning

0        bad energy
1        bad longitudinal position
2        energy has been provided by external means
3        energy has been calculated by failure mitigation algorithm
4        energy has been calculated for corrected longitudinal position
5        minus-face energy measurement is saturated
6        plus-face energy measurement is saturated
7        position has been provided by external means
8        longitudinal position has been corrected for direct light
9        longitudinal position has been corrected for ambiguous ratio
10-15    unused

status of h/w

16       bad minus-face LEX8
17       bad minus-face LEX1
18       bad minus-face HEX8
19       bad minus-face HEX1
20       bad plus-face LEX8
21       bad plus-face LEX1
22       bad plus-face HEX8
23       bad plus-face HEX1
24-25    unused

status of config

26       minus-face LE autoranging disabled
27       minus-face HE autoranging disabled
28       plus-face LE autoranging disabled
29       plus-face HE autoranging disabled
30-31    unused
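As an illustration of how the status word could be handled in code, the bit assignments can be written as a flag enumeration. This is a minimal sketch assuming the layout of bits 0-15 for recon status, 16-25 for hardware status, and 26-31 for config status; the class name, member names, and mask names are hypothetical and are not taken from CalXtalRecAlg or any other CAL code.

```python
from enum import IntFlag

class XtalStatus(IntFlag):
    """Hypothetical names for the 32-bit xtal status word bits."""
    # Recon status (bits 0-15, 10-15 unused)
    BAD_ENERGY = 1 << 0
    BAD_LONG_POSITION = 1 << 1
    ENERGY_EXTERNAL = 1 << 2
    ENERGY_FAILURE_MITIGATION = 1 << 3
    ENERGY_CORRECTED_LONG_POSITION = 1 << 4
    MINUS_FACE_SATURATED = 1 << 5
    PLUS_FACE_SATURATED = 1 << 6
    POSITION_EXTERNAL = 1 << 7
    POSITION_CORRECTED_DIRECT_LIGHT = 1 << 8
    POSITION_CORRECTED_AMBIGUOUS_RATIO = 1 << 9
    # Hardware status (bits 16-25, 24-25 unused)
    BAD_MINUS_LEX8 = 1 << 16
    BAD_MINUS_LEX1 = 1 << 17
    BAD_MINUS_HEX8 = 1 << 18
    BAD_MINUS_HEX1 = 1 << 19
    BAD_PLUS_LEX8 = 1 << 20
    BAD_PLUS_LEX1 = 1 << 21
    BAD_PLUS_HEX8 = 1 << 22
    BAD_PLUS_HEX1 = 1 << 23
    # Config status (bits 26-31, 30-31 unused)
    MINUS_LE_AUTORANGE_DISABLED = 1 << 26
    MINUS_HE_AUTORANGE_DISABLED = 1 << 27
    PLUS_LE_AUTORANGE_DISABLED = 1 << 28
    PLUS_HE_AUTORANGE_DISABLED = 1 << 29

# Masks selecting each field of the word.
RECON_MASK = 0x0000FFFF   # bits 0-15
HW_MASK = 0x03FF0000      # bits 16-25
CONFIG_MASK = 0xFC000000  # bits 26-31

# Example: both HE ranges on the plus face flagged as bad, which is one
# plausible way to mark the failed channel (an assumption, not confirmed).
status = XtalStatus.BAD_PLUS_HEX8 | XtalStatus.BAD_PLUS_HEX1
print(f"status word = 0x{int(status):08X}")
print("hardware problem flagged:", bool(status & HW_MASK))
print("recon bits set:", int(status & RECON_MASK))
```

Keeping the three fields behind separate masks means a later pass can test for any hardware problem with a single AND, without knowing which individual range is bad.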

Algorithms for correcting current and not-unlikely future failures in FixXtalResp

...