February 2023 GPS outage

This page documents what happened during the February 2023 GPS outage and records our attempts to check for adverse consequences. Executive summary: there are none of concern. The effect of the outage is visible in the Crab pulsar phases, but not in Vela, nor in the most stable gamma-bright MSPs.

Here's the big email thread:

Hi David:

The affected interval was a bit longer than 31 hours, from 13:18z on 22 February until about 22:28z on 23 February without an accurate 1 PPS pulse from the GPS receiver. The onboard navigational model was reinitialized from the GPS receiver’s position/velocity a couple of minutes after the nominal receiver operation was restored.

The attached plots from the LISOC website, which only processes telemetry which was declared to not be ITAR-sensitive prior to the launch, show the more precise times of the recovery in both timing and position, as well as the magnitudes of the accumulated errors.

It turns out that the techniques that the LAT DQM shifters are told to apply to determine the severity of the usual (i.e. much shorter) GPS outages apply here as well. The plot of swhwudlcpuperiod shows the adjustment of the period of the *spacecraft* 1 PPS pulse, which is supplied to the LAT, to bring the phase of that pulse back into alignment/lock with the GPS receiver 1 PPS pulse once the quality of the latter met the usual requirements (number of tracked satellites, DOP, etc.). The time of the sharp leading edge of the pulse corresponds to the time when the quality requirements were first met again after the outage. The plot of swhwudlgpssubsec shows the phase error, in units of 100 ns clock ticks, between the two 1 PPS pulses. You can see that the phase error – i.e. the timing error that had accumulated during the outage – as measured just before the quality was judged to be good enough to use to bring the two pulses back into phase alignment, was between 28 and 29 microseconds. In other words, if you assume a linear drift rate, then the 1 PPS timing reference to the LAT was drifting away from the true time at an average rate of slightly less than 1 microsecond per hour over the 31+ hours of the outage. This drift rate is not shockingly out of family with data from the usual, shorter outages, but it’s certainly well to the low side of the distribution of the drift rate during those outages. It appears that we were somewhat lucky in that the baseline period just prior to the onset of the anomaly, which is used to extrapolate forward during the outage, turned out to be highly representative.
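The drift-rate estimate in the paragraph above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function and constant names are illustrative, the 280 and 290 tick values are simply the quoted 28 and 29 microsecond bounds expressed in 100 ns ticks, and the linear-drift assumption is the one stated in the message. The quoted start and end times span roughly 33 hours rather than 31, so both durations are evaluated; the mean rate stays below 1 microsecond per hour in either case.

```python
from datetime import datetime, timezone

# Outage endpoints as quoted in the message (UTC); the end time is approximate.
OUTAGE_START = datetime(2023, 2, 22, 13, 18, tzinfo=timezone.utc)
OUTAGE_END = datetime(2023, 2, 23, 22, 28, tzinfo=timezone.utc)

TICK_NS = 100  # the phase error is reported in units of 100 ns clock ticks


def ticks_to_us(ticks: float) -> float:
    """Convert a phase error in 100 ns ticks to microseconds."""
    return ticks * TICK_NS / 1000.0


def mean_drift_us_per_hour(phase_error_us: float, hours: float) -> float:
    """Average drift rate, assuming the error grew linearly over the outage."""
    return phase_error_us / hours


# Duration implied by the quoted endpoints, in hours.
endpoint_hours = (OUTAGE_END - OUTAGE_START).total_seconds() / 3600.0

# Illustrative bounds: 28 us and 29 us expressed as ticks.
for ticks in (280, 290):
    err_us = ticks_to_us(ticks)
    for hours in (31.0, endpoint_hours):
        rate = mean_drift_us_per_hour(err_us, hours)
        print(f"{err_us:.0f} us over {hours:.1f} h -> {rate:.2f} us/h")
```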

The remaining plot is of one of the Cartesian components of the spacecraft position – i.e. part of the usual 1 Hz reporting from the spacecraft to the LAT in the “magic-7” packets. This shows the jump in that position when the onboard navigational model was reinitialized several minutes after normal GPS receiver operation was restored. The rough order of magnitude of the jump here appears to be around 20 km, and of course you have to subtract the slope of the background from this. However, as mentioned in a general email to the latops and datamon distribution lists Wednesday evening (while the outage was still in progress), there was an earlier action taken to correct a significantly larger position error and reduce the subsequent rate of error growth. That action caused a jump in the spacecraft position reported to the LAT whose magnitude appeared to be in the 130-140 km regime (although I only eyeballed the components and concluded that the Y component had the largest jump, which was about 132 km before subtracting the background slope). The time of that action was around 23:25:05z on 22 February – i.e. about 10 hours after the initial anomaly. (The earlier email said something more like 23:23z.) If I’m doing the arithmetic correctly, a position error of 135 km is about 450 light-microseconds. It therefore appears that – at least with an unfavorable barycentering geometry – the position error may be of greater concern than the local timing error at the LAT. Let me know if you need help obtaining details about the components of both jumps. . .although I suspect that you’re more proficient than I am at these things (and I also suspect that people can pull information about them from places like FT2 files).
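The light-travel-time arithmetic in the paragraph above can be reproduced as follows. This is a minimal sketch with illustrative function names. The projection helper assumes that only the component of the position error along the direction to the source contributes to the barycentering delay, which is what makes the full 450 microseconds a worst-case ("unfavorable geometry") figure. The direction vectors in the usage lines are made up for illustration and do not correspond to any real pulsar.

```python
import math

C_KM_PER_S = 299_792.458  # speed of light in km/s (exact by definition)


def light_travel_time_us(distance_km: float) -> float:
    """Light travel time across distance_km, in microseconds."""
    return distance_km / C_KM_PER_S * 1e6


def projected_delay_us(error_km, direction) -> float:
    """Timing error from a position error vector (km), projected onto the
    direction to the source. `direction` need not be normalized."""
    norm = math.sqrt(sum(c * c for c in direction))
    along_km = sum(e * c for e, c in zip(error_km, direction)) / norm
    return light_travel_time_us(along_km)


# Magnitudes quoted in the message.
for label, km in [("jump at reinitialization", 20.0),
                  ("Y component of the earlier jump", 132.0),
                  ("position error used in the estimate", 135.0)]:
    print(f"{label}: {km:.0f} km -> {light_travel_time_us(km):.0f} us")

# Hypothetical geometries: a 132 km error along Y, seen from two directions.
print(projected_delay_us((0.0, 132.0, 0.0), (0.0, 1.0, 0.0)))  # worst case
print(projected_delay_us((0.0, 132.0, 0.0), (1.0, 0.0, 0.0)))  # no effect
```

With these numbers the worst-case position term is more than ten times larger than the 28 to 29 microsecond clock error accumulated over the outage, consistent with the conclusion in the message.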

The root cause of the GPS receiver anomaly is not known, but is believed to be related to some radiation-induced upset. Anomalies with similar symptoms in receivers of the same model and similar vintage have been reported.

Recovery was ultimately accomplished after power cycling and reinitializing the receiver – following roughly the procedure utilized when initializing the receiver a day or two after launch, and again a few months later after installing revised receiver software. (You may recall that software update, which was created to eliminate some timing jumps encountered every few days in the receiver’s 1 PPS output, where the magnitude of the jumps was several hundred microseconds, and dependent upon the number of GPS satellites being tracked by the receiver at the time of the jump.)

I don’t see any enhanced need to worry about the future. So far there is no indication that a degenerative process is at work – although the statistics are obviously small.

Your guess is as good as mine with regard to the health of the backup GPS receiver – it hasn’t been turned on since the spacecraft was on the launch pad. (It still has the original software in its EEPROM.)

Let me know if you have further questions. Yours in the fellowship of the aubergine. . . Best, ejs

From: David A. Smith <smith@cenbg.in2p3.fr>
Sent: Friday, February 24, 2023 6:43 AM
To: maldera@to.infn.it; Hays, Elizabeth (GSFC-6610) <elizabeth.a.hays@nasa.gov>; Cameron, Rob <rac@slac.stanford.edu>; Racusin, Judith L (GSFC-6610) <judith.racusin@nasa.gov>; Ojha, Roopesh <roopesho@slac.stanford.edu>; Siskind, Eric J (GSFC-4440)[NYC REALTIME COMPUTER] <eric.j.siskind@nasa.gov>
Cc: bruel <Philippe.Bruel@llr.in2p3.fr>; Giacomo Principe <giacomo.principe@inaf.it>; Kerr, Matthew T CIV USN NRL (7655) Washington DC (USA) <matthew.kerr@nrl.navy.mil>
Subject: Re: [EXTERNAL] Re: LAT DQM is showing the lack of timing data and GPS data

On 24/02/2023 at 12:20, Simone Maldera wrote:

ok, I will set the ft2 quality flag to 2 for the affected time interval (the same value that was used in the past for a timing issue).

We will also mark these runs as "good with bad parts" and set the GPS flag to "bad" in the run quality database.

From David:

Thanks everybody for your work. Roughly how long is the affected interval?

Do we know why the GPS stopped working? And then started working? Need we fear for the future? Do we think that the backup GPS is working fine?

I can imagine that for a gamma-bright MSP with a stable rotation ephemeris, or perhaps for the Crab (or Vela?) with a radio ephemeris covering the outage, we might be able to show the gamma-ray phase going astray during the outage. But mostly I suspect that the interval is so short that no measurements we really care about will ever be affected by this incident.

Perhaps Matthew and/or I will attempt a "post mortem" in the weeks to come. Best, David.

The three plots that Eric attached to his message:
