Note: This procedure assumes that a change has already been approved by C&A, and that a JIRA has been entered documenting the desired change. Please document JIRAs in the comments, and note when they are closed.
6/17/19 - TS - I'll be updating this page to reflect what has been done and any outstanding questions on how to proceed. Feel free to answer any questions (marked in red) that come up if you know the answer. Also, please add in any steps or sub-steps. Items with a checkmark by them are completed.
- STGEN-165
Implementation Process:
- Define the names of the new columns.
- Update File Format Document & Interface Control Document.
- Implement the new columns in fitsGen, which means adding them to the template FT2 file: https://github.com/fermi-lat/fitsGen/blob/master/data/ft2.tpl
- Change the FT2 code to get the velocities from the magic7 data and save them in the new columns.
- Create pipeline tasks to reprocess the FT2 data
- Create pipeline tasks to properly apply the Bad Time interval data.
- Test everything end-to-end (i.e., the pulsar software) with a sample of the new FT2 files.
- Use the code to reprocess 1-2 weeks of data for testing.
- We reprocessed a two-week chunk of data from runs 545208791 to 546407418. This included several files with BTI data in the database.
- More files were reprocessed with the P310 task(s). This includes a variety of cases for BTIs and a selection of different DATA_QUALs.
- Check to make sure the velocities are being copied correctly from the magic7 packets and that they make sense based on the change in coordinates - Don/Tom.
- Ingest files at FSSC - Don/Alex
- Verify that the BTI data is correct - ME/Tom/Simone
- Test with Pulsar Tools - Dave Smith
- Non-pulsar tests - Joe Eggen.
- Check with the C&A group. Don brought it up at the 6/24 C&A group meeting. No one raised any issues (or volunteered to do any testing).
- Check with other mission elements. Don checked with Michelle Hui at the GIOC that the change won't affect them.
- Paul Ray checked the files against the velocities computed by PINT and found good agreement.
- Fix all the runs that need repiping (Runs to be rePiped and reprocessed): still in progress.
- Obtain a new GlastRelease:
- Remake the P310 tasks
- Re-reprocess a selection of files
- Verify that the geo latitude and longitude are properly created (Toby will do this).
- Repeat the checks in item 5. above (spot-checks are probably fine).
- Verify that FT1 files produced with the new GR are correct/unchanged from the previous version
- Reprocess the FT2 files (Usually done in two steps. Initial reprocess & backfill):
- Pick an end date/time for the reprocessing window, once we all agree that we are ready to go.
- Create a list of runs for reprocessing and BTI flagging from the datacatalog. Validate against the list at the FSSC server.
- Install the reprocessing tasks in the PROD pipeline.
- The reprocessing step will put the files into /Data/Flight/Reprocess/P310.
- FSSC will pick them up from xrootd for ingestion directly.
- The reprocessing will take ~1 month. Started 2020-03-04.
- Transfer the files to the FSSC and ingest them into the FSSC's data server.
- Have Wilko set up a proxy server so Don can copy the files directly from xroot.
- Don copies files and does basic checks and archives them.
- Alex starts the process to ingest them into the data server database. He estimates the ingest may take ~1 week.
- Create a new version of the L1Proc, rePipeL1 and Flag FT2 tasks.
- Update the International Geomagnetic Reference Field (IGRF). See Updating the IGRF-13 implementation in the astro package.
- Test them meaningfully on DEV.
- Including a check that the FT1 files haven't changed (see Checking FT1 files in the context of the FT2 reprocessing and Bad interpolation of FT2 information during processing).
- Get CCB approval to make the change operational.
- Coordinate a switchover date between ISOC and FSSC.
- Create the backfill
- Reprocess 2020 runs with the new IGRF model.
- Switch to the new L1 task(s). Note: backfill and switch must be coordinated and timed together.
- Switch the data server to use the new files. The FSSC has an internal wiki page on the steps for this.
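The velocity sanity check in the list above (comparing the new SC_VELOCITY columns against the change in SC_POSITION) can be sketched as below. This is a minimal illustration with a synthetic circular orbit standing in for real FT2 columns, not the actual checking code:

```python
import numpy as np

def max_velocity_residual(met, pos, vel):
    """Compare an FT2-style SC_VELOCITY column against the numerical
    derivative of SC_POSITION; returns the worst residual in m/s,
    ignoring the endpoints where the finite difference is one-sided."""
    dpos = np.gradient(pos, met, axis=0)        # finite-difference velocity
    resid = np.linalg.norm(dpos - vel, axis=1)  # per-row disagreement
    return resid[2:-2].max()

# Synthetic circular orbit (~6.9e6 m radius, ~95 min period) sampled at 30 s,
# standing in for the MET, SC_POSITION, and SC_VELOCITY columns of a real file.
met = np.arange(0.0, 5700.0, 30.0)
w, r = 2 * np.pi / 5700.0, 6.9e6
pos = np.stack([r * np.cos(w * met), r * np.sin(w * met), np.zeros_like(met)], axis=1)
vel = np.stack([-r * w * np.sin(w * met), r * w * np.cos(w * met), np.zeros_like(met)], axis=1)
print(max_velocity_residual(met, pos, vel))  # a few m/s vs. ~7.6 km/s orbital speed
```

On real files the columns would be read from the SC_DATA extension (e.g. with astropy.io.fits); agreement at roughly the m/s level away from file boundaries is the kind of result this check should show.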
DataCatalog Commands:
datacat find --sort nRun --filter 'nRun>=239557414 && nRun<=585049107' --group FT2 /Data/Flight/Level1/LPA > P310_FT2_bulk.txt
To be updated with a newer upper range run when the cutoff date is chosen.
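The "validate against the list at the FSSC server" step above amounts to a set comparison between the two run lists. A minimal sketch, with toy run lists (the real lists would be parsed out of the datacat output and the FSSC's holdings):

```python
def compare_run_lists(isoc_runs, fssc_runs):
    """Return (runs only in the datacat list, runs only at the FSSC)."""
    isoc, fssc = set(isoc_runs), set(fssc_runs)
    return sorted(isoc - fssc), sorted(fssc - isoc)

# Toy lists: two early runs present in the datacat but missing at the FSSC
isoc_only, fssc_only = compare_run_lists(
    [239557414, 240643862, 240729801, 243325118],
    [239557414, 243325118],
)
print(isoc_only)  # candidates to exclude from (or flag in) the reprocessing
```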
Comments:
Elizabeth C Ferrara
Request for complete state vectors in FT2 file: https://jira.slac.stanford.edu/browse/STGEN-165
David A. Smith
Tom Stephens
David, is there any particular time range you'd like? i.e. specific MET range? I was going to launch into the reprocessing today and just pick a random 2 week interval (that covers some BTI-flagged runs, per Simone's comment below) but if you have a specific time range that would be interesting let me know and I'll do that.
David A. Smith
Tom, any 2-week period that suits you will suit me. Our rotation ephemerides cover pretty much the whole mission. I'll use bright pulsars, like Vela, for which 2 weeks will give adequate statistics for highlighting all-but-subtle bugs. Thanks.
simone maldera
Hi,
I would test some ft2 file that includes a BTI, just to be sure that it is correctly flagged in the reprocessed file. (but perhaps you already did this test)
Tom Stephens
I have checked a few but am going to do some more to be certain. It seems to all be working though.
Maria Elena Monzani
Hi Simone, do you have a nice list of files with BTIs? I had to dig a bit deep and only found one run with actual BTIs...
Tom Stephens
BTIRuns.csv: this has all the runs that have BTI data in the PROD database. I think I forgot to sort it when I exported it from the database, but a quick sort command will fix that.
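The "quick sort command" Tom mentions would be something like the following, shown here on a toy stand-in for BTIRuns.csv (assuming one run ID per line):

```shell
# Toy stand-in for the exported BTIRuns.csv (one run ID per line, unsorted)
printf '545213810\n545208791\n545210684\n' > BTIRuns.csv
# Numeric sort restores run order
sort -n BTIRuns.csv > BTIRuns_sorted.csv
cat BTIRuns_sorted.csv
```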
Tom Stephens
And these are the ones in the range that I reprocessed:
545208791
545210684
545213810
545220436
545226530
545232555
545238531
545244491
545250444
545256394
545262253
545267657
545273446
545279151
545284857
simone maldera
Hi Tom,
I just had a quick look and I found some runs (e.g. 526563877 and 551001715) in your list for which the BTI was never applied to the FT2 file.
I think this happens when the code for the solar flare suggests a bad interval, but then we decide to neglect it. In this case you find the BTI in the database, but the corresponding FT2 was not modified.
Tom Stephens
Yes, that list is just everything that had entries in the database. Looking at the details of the data, those two runs never had the BTIs applied and wouldn't be applied if submitted to the reprocessing task. When I made the list, it was just everything that had BTI data in the database, whether it was applied or not. I'm going to try to pull a better list of BTIs from the database to verify that it agrees with your manual list. Time to practice my SQL.
Maria Elena Monzani
I think that we can just submit all these runs for flagFT2. It runs in 30 seconds, and it's a lot better to be comprehensive than to potentially miss a BTI.
Tom Stephens
That's definitely easier. That was my original plan, but there seemed to be a push to get the list "right". I've verified that the process doesn't do anything when it shouldn't, so I'm all for just running all of them.
simone maldera
I updated this page using a list of BTIs that I have on my pc.
Maria Elena Monzani
Thanks Simone, this is massively helpful!
I'll reprocess some of these runs too. Most runs in the 54* range didn't actually have BTIs, as you explained.
I also found a BTI in run 578530168: does that agree with your records?
simone maldera
about run 578530168: it had some strange errors and I flagged a BTI, but then everybody agreed that it was just a processing problem (this run is one of the candidates for a repipe). So I think I can remove this BTI
Maria Elena Monzani
Thanks for explaining. I've reprocessed the run and all errors have disappeared. Removing the BTI now.
Maria Elena Monzani
I added ~15 runs from your list to the battery of tests for P310 in DEV. I also found a handful of runs that had escaped all our checks in the past and didn't have DATA_QUAL=0, so I created BTIs for them. They'll be picked up by P310 when we run it in PROD. These runs are: 320245802, 320245562, 320241362, 320237162, 433644656. I suspect we will find more when we get closer to the full reprocessing.
Tom Stephens
Don is downloading the 2-week chunk of files I reprocessed and will let people know when they are checked at the FSSC and ready to download and test with the tools.
Don Horner
David A. Smith
Here's a first quick result for Vela. Left is with the two new one-week 30s FT2 files, "classic" uses ftp://legacy.gsfc.nasa.gov/fermi/data/lat/mission/spacecraft/lat_spacecraft_merged.fits as I always have.
I'm gonna poke around a little more, but the first impression is that the results are identical.
Update 25 June: The 3rd plot uses the 2 weeks of 1 second FT2 files. The 2nd digit after the decimal point of the TS value changed by one unit. Meaning, same result but I'm sure it wasn't just the same files!
More in the coming days...
Paul Ray
I downloaded those FT2 files and modified the PINT code to use the velocities. The previous version computed the velocities as a gradient of the positions. The new velocities agree with my computed ones to good accuracy (~1 m/s) away from the endpoints of the file. So I've committed the new PINT code to GitHub, which will use the velocities in the FT2 file whenever they are present. Thanks for implementing this improvement to the FT2 files!
Don Horner
Thanks for checking (and thanks to David too)!
Don Horner
I compared the two P310 weekly files to the current P202 files. The P310 file for week 515 starts about 6 hours after the P202 file. It looks like we are missing three runs (545187861, 545193566, and 545199272) at the beginning of the week. I imagine that might have a small effect on some results. Tom Stephens, can we get those runs reprocessed so the weekly files for 515 will be identical (except for the velocity column)?
Tom Stephens
Sure, I can process those three runs. It looks like they were just before the arbitrary start MET I picked.
Maria Elena Monzani
Hey Tom. FYI, the P310-FT2 and flagFT2-P310 tasks only take one argument. Example syntax:
Tom Stephens
Good to know. Thanks.
Tom Stephens
Maria Elena,
I processed those three runs through the P310-FT2 task and they came out with the data quality flags set to zero instead of one. Do they have to be run through the flagFT2 task as well? I believe doing so would fix them as they have BTI entries in the database setting their values to 1 but I would not expect that to be necessary by default.
Maria Elena Monzani
I made a mistake when running setL1Quality for these runs. You should be able to rollback the P310-FT2 task for them now.
Don Horner
The 515 weekly file has been updated to include those three missing runs at the beginning of the week.
Joseph R Eggen
I've tested the new FT2 files (for Mission Weeks 515, 516, & 517) by following the steps in the "Unbinned Likelihood" thread. All of the tools tested (gtselect --> gtlike) work without issue. I'm still running an instance of gttsmap, but that may not finish until sometime over the weekend. Still, it hasn't crashed yet so I consider that a good sign.
Furthermore, I've done a parallel analysis (same photon data, target, and time range) using a current spacecraft file fresh from the data server, and the results from the likelihood step are identical in both analyses. So far the tools are showing no issues with the new FT2 files.
simone maldera
Hi,
I checked the DATA_QUAL flag for all the reprocessed files, and for two of them (runs 320245802 and 578530168) I found a discrepancy (with respect to the mission FT2 file from FSSC)
In both cases the reprocessed file has DATA_QUAL=0, while the one from FSSC has DATA_QUAL=1.
Run 578530168 is the one that was fixed by M.E. a few days ago, for which there was a BTI that was not needed, so the flag was set back to 1.
Run 320245802 is one of the runs for which M.E. created a new BTI (so the fact that the new file has DATA_QUAL=0 should be OK).
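Simone's cross-check can be sketched as a per-run comparison of DATA_QUAL flags. The dicts below are toy stand-ins for values read from the reprocessed and FSSC mission FT2 files, mirroring the two discrepant runs above:

```python
def dataqual_diffs(reprocessed, mission):
    """Return {run: (reprocessed_flag, mission_flag)} for runs whose
    DATA_QUAL differs between the two sets of FT2 files."""
    return {run: (flag, mission.get(run))
            for run, flag in reprocessed.items() if mission.get(run) != flag}

# Toy flags keyed by run ID (0 = bad-time-flagged, 1 = good)
repro = {320245802: 0, 578530168: 0, 545208791: 1}
fssc  = {320245802: 1, 578530168: 1, 545208791: 1}
print(dataqual_diffs(repro, fssc))  # the two discrepant runs
```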
Tom Stephens
Looking at the database for 578530168, the production one has the updated BTI to set the value to 1, while the dev database doesn't have the latest entry and still has the data quality flag set to 0. So the test files would have the 0 value, as you saw, while the final reprocessed ones will have the correct value of 1. Thanks for checking those.
Don Horner
I compared the list of files generated by the data catalog query Maria Elena added to the page to what the FSSC has. For that time range, there are two early runs (from August 2008) in the list that we don't have files for at the FSSC: 240643862 and 240729801. Going back through old email, they are on a list from 2009 of bad runs that Anders Borgland said "should not be used for Physics". I assume that's why the FSSC doesn't have photon/event files for those runs either. We actually did get FT2 (but no FT1 files) for the two other runs on Anders' list: 243325118 and 243336617.
We could exclude all four runs from the reprocessing, but leaving them in wouldn't really be a problem. There just won't be any photon/event data for those times.
Maria Elena Monzani
Hi Don, thanks for cross-checking the list. It's fine to exclude those from reprocessing. Next step is to pick a cutoff date (probably when Tom and I are back from travel).
Don Horner
I feel like I should give an update since there hasn't been one posted for a while. I think we've gotten everything sorted out, and we'll probably start the reprocessing of the files in a few weeks after Maria Elena and Tom are back from various trips.
Don Horner
Tom Stephens
Just a status update.
We've verified that the FT2 files are good, but it was pointed out that since we've changed the underlying GlastRelease we should verify some FT1 files as well. Getting the new L1Proc set up is in Maria Elena's court, and she's currently out on medical leave. Per an e-mail from Oct 24th, she said:
"I'm currently on medical leave, but I'll take a look at the new L1Proc in a couple of weeks. I need to make some logic changes, which hopefully will make it easier to repipe entire runs. When that's ready, I'll give you guys a handful of FT1 files produced with the new GR/L1."
Once we've got that new L1Proc and those files to check, we'll get the reprocessing tasks installed in the PROD pipeline and get the reprocessing under way. That will probably not start until at least next week sometime.
Don Horner
We're getting close to releasing the updated files. The reprocessing has been done up to the beginning of March 4, 2020 (when the reprocessing was kicked off at the software week). I think Nicola has pretty much finished the pipeline update to incorporate the new IGRF model.
The reprocessed weekly files (30s and 1s) and mission long file have been put on the FSSC's FTP site in some temporary directories. You can take a look and do any checking you'd like:
Don Horner
For those not on the #software channel, Maria Elena posted an update
I’ve tested the updated GR and uploaded new reprocessing tasks. I’ve reprocessed the runs from 2020 with the updated IGRF model. @donhorner: would you like to re-import and check those files? Run range is: 598501886-604845703, including 1117 runs (interestingly enough, the IGRF code started failing on Dec 20, 2019, so I started from that date). Location is the usual one (files will have an increased version number). No BTIs were present for those runs, so I didn’t apply BTIs. I know there were a couple of questions about old BTIs, and I’ll look at those next. Once we decide that everything looks good, we should do a backfill of runs between March and July of this year.
I compared some of the runs to the previous versions from the initial reprocessing. The only differences were in the McIlwain B & L parameters and the invariant and effective geomagnetic latitudes, which I would expect given the updated IGRF model.
Alex ingested the new files into the data server. I've updated the reprocessed weekly files (30s and 1s) and mission long file on the FSSC's FTP site in the test directories. You can take a look and do any checking you'd like:
The IGRF model update only affects the week 603 through 613 files. I saved the older versions of those weekly files in a subdirectory called "old-igrf" under the weekly file directories if anyone wants to compare.
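The version-to-version comparison described above (only the IGRF-dependent geomagnetic columns should change) can be sketched as a column-wise diff. Column names and values here are illustrative stand-ins, not read from real files:

```python
import numpy as np

def differing_columns(old, new, rtol=1e-9):
    """List columns whose values changed between two versions of a file
    (old/new map column name -> numpy array of column values)."""
    return sorted(c for c in old if not np.allclose(old[c], new[c], rtol=rtol))

# Illustrative columns: a geomagnetic quantity shifts, a position-based one doesn't
old = {"B_MCILWAIN": np.array([0.226, 0.231]), "LAT_GEO": np.array([10.0, 10.5])}
new = {"B_MCILWAIN": np.array([0.227, 0.232]), "LAT_GEO": np.array([10.0, 10.5])}
print(differing_columns(old, new))  # only the IGRF-dependent column differs
```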
Don Horner
Since it looks like we're close to release, here's a draft of an announcement to send out on fermi-news and put on the website news feed. Any comments welcome:
Robert Cameron
On the SC_VELOCITY point: should the comment include a statement that the SC_VELOCITY vector is in the same coordinate frame as SC_POSITION?
On the corrected geodetic latitude and altitude: should the comment include a statement about the magnitude of the correction, and a link to more information about Ferrari's solution?
Don Horner
I can add a note about SC_POSITION.
The only information I have about the geodetic correction is from the JIRA issue. There's a link there to a Wikipedia article on Ferrari's solution that I could add.
Do you have anything specific you think we should say about the magnitude of the change to the exact solution? Maybe I can just state that the correction is small and not expected to impact analyses?
Don Horner
Here's an updated version:
Don Horner
Rob sent me some info about the difference between the geodetic calculations. Here's the latest version:
Maria Elena Monzani
Hey Don, your announcement looks good. I'm planning to switch over the pipeline tomorrow, 4/20, at around mid-day PST.
Don Horner
Fine with me.