Blog from July, 2008

Reason for Change

See the Details section below. The most important of these changes is the pre-filtering of the FT1 files before merging.

Test Procedure

Tested by hand on ASP week 5 data (the impetus for ASP-38@JIRA). Tested in dev on /ASP/TestData2.

Rollback Procedure

Revert to ASP v2r8p1.

CCB JIRA

ssc-110@jira

Details

  • ASP-20@JIRA Archive "permanent" data on xrootd and cleanup ancillary files at end of processing
  • ASP-37@JIRA Use glastdataq for the getPgwInputData process in the PGWave task
  • ASP-38@JIRA Apply zenith angle and CTBCLASSLEVEL cuts on individual FT1 files before merging to avoid excessively large merged files.
  • ASP-39@JIRA Break up runpgw into smaller, atomic processes to ease partial rollbacks
  • ASP v2r8p2
    • AspHealPix v0r0p1
    • AspLauncher v1r3p3
    • AspPolicy v0r6
    • BayesianBlocks v0r2
    • asp_pgwave v1r7p4*
    • drpMonitoring v1r4p4*
    • grbASP v4r5p4
    • pyASP v3r5p5*
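
The pre-filtering described in ASP-38 can be illustrated with a minimal sketch. This is not the pyASP code: events are plain dictionaries rather than FT1 rows, and the column name ZENITH_ANGLE, the two cut values, and the function names are assumptions made only for illustration. The point is that each file is cut before its events are appended, so the merged list never holds the rejected events.

```python
from typing import Dict, Iterable, List

Event = Dict[str, float]

# Hypothetical cut values; the real ones would come from the ASP configuration.
MAX_ZENITH_ANGLE = 105.0  # degrees
MIN_CLASS_LEVEL = 2       # minimum CTBCLASSLEVEL to keep


def passes_cuts(event: Event) -> bool:
    """Apply the zenith angle and CTBCLASSLEVEL cuts to one event."""
    return (event["ZENITH_ANGLE"] < MAX_ZENITH_ANGLE
            and event["CTBCLASSLEVEL"] >= MIN_CLASS_LEVEL)


def filter_then_merge(files: Iterable[List[Event]]) -> List[Event]:
    """Cut each input file on its own, then merge the surviving events."""
    merged: List[Event] = []
    for events in files:
        # Filtering here keeps rejected events out of the merged product.
        merged.extend(e for e in events if passes_cuts(e))
    merged.sort(key=lambda e: e["TIME"])  # keep the merged events time ordered
    return merged
```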

Reasons for change

The most important modification listed below manages the application of the CTBCLASSLEVEL cuts using the configuration in the db tables. Some of the data in LEO, including the most recent data, have been processed without the correct CTBCLASSLEVEL cuts being applied, so that transient class events were being analyzed. This resulted in the source fluxes being over-estimated, since source class IRFs were assumed. This change will also make it easier to switch to diffuse class, should that be deemed appropriate.

Test Procedure

Tested in DEV on /ASP/TestData2.

Rollback Procedure

Revert to ASP v2r8 + ASP/grbASP v4r5p2 + ASP/drpMonitoring v1r4p1

CCB JIRA

ssc-109@jira

Details

  • ASP-33@JIRA Find robust way of specifying CTBCLASSLEVEL cuts in drpMonitoring
  • ASP-34@JIRA Handle leap seconds in converting GRB times in UTC to MET
  • ASP-35@JIRA grb refinement crashes in extractLatData if 1 photon is returned from the initial extraction
  • ASP-37@JIRA Use glastdataq for the getPgwInputData process in the PGWave task
  • Added MultiPartMailer.py module to handle email messages with image attachments (used by GRB_blind_search to send plots to BAs).
  • ASP v2r8p1
    • AspHealPix v0r0p1
    • AspLauncher v1r3p3
    • AspPolicy v0r6
    • BayesianBlocks v0r2
    • asp_pgwave v1r7p2*
    • drpMonitoring v1r4p2*
    • grbASP v4r5p4*
    • pyASP v3r5p4*
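
For ASP-34, the sketch below shows one way a UTC time could be converted to MET with leap seconds taken into account. It is not the grbASP code. It assumes that MET counts elapsed seconds since 2001-01-01 00:00:00 UTC (the epoch is not stated in this post), and the names MET_EPOCH, LEAP_SECOND_DATES and utc_to_met are made up for the example. The table lists the leap seconds inserted since that epoch and has to be extended whenever a new one is announced.

```python
from datetime import datetime

# Assumed mission epoch (UTC); not taken from the ASP code.
MET_EPOCH = datetime(2001, 1, 1)

# First UTC instant after each leap second inserted since the epoch.
LEAP_SECOND_DATES = [
    datetime(2006, 1, 1),
    datetime(2009, 1, 1),
    datetime(2012, 7, 1),
    datetime(2015, 7, 1),
    datetime(2017, 1, 1),
]


def utc_to_met(utc: datetime) -> float:
    """Convert a naive UTC datetime to elapsed seconds since MET_EPOCH."""
    # datetime arithmetic ignores leap seconds, so this undercounts.
    naive_seconds = (utc - MET_EPOCH).total_seconds()
    # Add one second for every leap second that occurred before `utc`.
    leap_seconds = sum(1 for d in LEAP_SECOND_DATES if utc >= d)
    return naive_seconds + leap_seconds
```

Without the correction, every time after the end of 2005 would come out one or more seconds too early.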

Reasons for change

The GRB group wants the GRB name appearing on the data processing page to revert to GRBYYMMDDFFF, where FFF is the fraction of the day x 1000 when the burst occurred. They also want the GCN_Notice_processor.py script to add this name to the subject line of the redistributed notices. Data processing requested a check in GRB_blind_search for out-of-order events.
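
As an illustration of the GRBYYMMDDFFF format, here is a minimal sketch that builds the name from a UTC burst time. The function name grb_name is hypothetical, and truncating (rather than rounding) the day fraction is an assumption, chosen so that FFF can never reach 1000.

```python
from datetime import datetime


def grb_name(burst_time_utc: datetime) -> str:
    """Build a GRBYYMMDDFFF name, FFF being the day fraction times 1000."""
    midnight = burst_time_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = burst_time_utc - midnight
    elapsed_us = elapsed.seconds * 1_000_000 + elapsed.microseconds
    # Integer arithmetic avoids floating point surprises; the result is
    # truncated, so 23:59:59 gives 999 rather than 1000.
    fff = elapsed_us * 1000 // (86400 * 1_000_000)
    return f"GRB{burst_time_utc:%y%m%d}{fff:03d}"
```

For example, a burst at 12:00:00 UTC on 24 July 2008 would be named GRB080724500.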

Test Procedure

Tested by running by hand on recent GBM GCN notices. The out-of-order event code was tested by running on data from recent GRB search proc failures.

Rollback Procedure

Revert to ASP v2r8 + ASP/drpMonitoring v1r4p1

CCB JIRA

ssc-107@jira

Details

  • ASP-31@JIRA Add code to GRB_blind_search to check for out-of-order events in FT1 files
  • ASP-32@JIRA Use GRB name format GRBYYMMDDFFF for data processing page and for appending to subject line of GCN Notices.
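
The out-of-order check of ASP-31 amounts to scanning the event times for any entry that is earlier than its predecessor. The sketch below shows the idea on a plain sequence of times; it is not the GRB_blind_search code, and the function name find_out_of_order is an assumption. In practice the times would be read from the TIME column of the FT1 file.

```python
from typing import List, Sequence, Tuple


def find_out_of_order(times: Sequence[float]) -> List[Tuple[int, float, float]]:
    """Return (index, previous_time, time) for every event that is
    earlier than the event just before it."""
    problems = []
    for i in range(1, len(times)):
        if times[i] < times[i - 1]:
            problems.append((i, times[i - 1], times[i]))
    return problems
```

An empty result means the events are time ordered; equal consecutive times are accepted.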

Reason for change

We have a new GlastRelease that uses FSW B1-1-0 and which is compatible with the new DFI/evt file format. This release is needed to process data taken with the new FSW build that will be activated on Thursday July 31.

There are also general updates to the OBF filters to improve the agreement between the FSW and OBF Gamma-filters, a change to AcdRecon to catch G4 propagator problems, a change to the L1 job options that caused TKR calib crashes under some circumstances and a fix so that ACD tiles that are hit are displayed in FRED.

We have also re-enabled the "crash-in-case-of-Moot-key-mismatch" feature which was disabled in the Emergency request update to L1Proc 1.61.

There are also a lot of updates to the Data Monitoring and the Alarms, bug fixes and improvements to the L1 pipeline itself and a fix that will enable the Skimmer to ingest merged Root files produced by the L1 pipeline.

Test Procedure

FSW and Bryson provided evt files made with FSW 1-1-0 and in the new format so we could verify that GlastRelease could process the new data. Testing in the DEV pipeline has been made difficult by the fact that this data did not exist in Moot or in the Acquisition tables, did not have a corresponding Magic 7 file, depended on ISOC_TEST, etc. We have worked around most of these problems, for example by disabling Moot access for the testing. This means that we will have to make minor modifications to some of the scripts and job options before we put this L1Proc into PROD, and that we have not been able to test everything together at the same time in the DEV pipeline. Usually we test the exact code.

Rollback procedure

L1Proc can NOT be rolled back to the previous version if we activate FSW B1-1-0 on the instrument.

CCB Jira

SSC-106@JIRA

Details

L1Pipeline v1r62
- Fixed the dependencies for copyM7Hp and registerM7Hp and copyM7Lci.
- Make chunkLists human-editable. LONE-93@JIRA
- Test for pathological chunks. LONE-92@JIRA
- Avoid core files in log directory from ingest jobs.
- Don't fail kludgeAsp if the stream has already been launched. ASP-26@JIRA
- Put runId and chunkId in target when logging missing files on merge. LONE-94@JIRA
- Trim a couple of static values from registerPrep/registerStuff interface.
- Require 50GB free scratch for merging recon.
- Performance/reliability tweaks. LONE-100@JIRA
- Change the way batch jobs are run so the ISOC environment can be chosen dynamically.
- In test mode only, acqQuery.runTimes returns bogus hardcoded values if run is not in ACQSUMMARY.

GlastRelease v15r33:
- Includes a working GCRCalib.
- System tests for GR v15r33 with respect to the previous production release, GR v15r24. The differences are understood.
- RM diff with respect to GR v15r24.

svac/EngineeringModelRoot: v4r1p7
- Added state variables for all the OBF filters and version number for the FSW Gamma-filter.

dataMonitoring/AlarmsCfg: v2r4p2
- Limits adjusted for digi trending. Relevant variables: Rate_AcdGemCNO_GARC, Mean_TkrTotalHitsPerEvt, OutF_NormalizedTkrTriggerWithLessThan6Layers_Tower, Rate_NTkrHits_TowerPlane, Mean_ConsecutivePlanesHit_Tower, DoubleDiffRate_LiveTimeFraction.
- New start_x parameter also used for the CAL LO trigger threshold plot.
- Recon eor improved taking advantage of the new start_x parameter of the alg__leftmost_edge algorithm (used for the alarms on the CAL LAC thresholds and the ACD veto thresholds, for which the threshold on the significance has also been changed from 60 to 20).
- Got rid of a couple of alarms on unstable recon quantities.
- Limits on the merit trending and recon eor updated.
- A few limits updated based on the new runs.
- Got rid of the alarm on the rms of TkrHits_TH1_Tower (Digi).
- A few limits updated based on the new runs.
- Got rid of the alarm on the rms of TkrHits_TH1_Tower (FastMon).
- Added an exception for the hot strip in tower 3.

dataMonitoring/FastMonCfg: v1r4p4
- Add ToT plots per tkr tower layer controller. Relevant jira(s): GDQMQ-148
- Fixed all cuts to get rid of events with an error (error_summary<64) in config.xml. I chose not to propagate this cut to the configLCI since there should NOT be errors in LCI runs, and if there are we want to spot them by any means. Relevant jira(s): GDQMQ-203
- Fixed all cuts to be explicitly booleans in config.xml and configLCI.xml. Relevant jira(s): GDQMQ-208
- Added AcdHitChannel to monitor the ACD occupancy per cable, channel in config.xml and configLCI.xml
- Corresponding plot name is AcdHitsCounter_CableChannel_TH2. Relevant jira(s): GDQMQ-1
- New error_summary variable added. Relevant jira(s): GDQMQ-173
- Move new_second and clocktics_dev_20MHz from all secondary xml files to baseConfig. Relevant jira(s): GDQMQ-196
- Added Z axis pointing in galactic coordinates L,B in baseConfig.xml. Relevant jira(s): GDQMQ-193

dataMonitoring/DigiReconCalMeritCfg: v1r2p10
- Bug corrected for quantity Rate_NTkrHits_TowerPlane. This addresses Jira GDQMQ-221
- Correct errors of some quantities: Jira GDQMQ-218.
- Add quantities to solve jira GDQMQ-214
- Add quantities to solve jira GDQMQ-200
- Solve issue described in Jira GDQMQ-207.
- New quantities (jiras GDQMQ-201 and GDQMQ-211) and bug correction (jira GDQMQ-213)
- FastMon trending of Z axis pointing in galactic coordinates. GDQMQ-198

dataMonitoring/Common: v3r7p0
- Possibility to select asymmetric fit ranges for the Cal Ped/Gain and the Acd Ped analyzers implemented.
- Add the correct error bars to the output histograms of the Cal Ped/Gain and the Acd Ped analyzers.
- Added the distribution of the mean, rms and reduced chisquare to the output root file from the Cal Ped/Gain and the Acd Ped analyzers.
- Minor improvements to the plots appearance for the Cal Ped/Gain and the Acd Ped analyzers.
- Added a switch to the Cal gain analyzer to get rid of the highest bin in the left/right plots (unphysical spikes). Relevant jira(s): GDQMQ-215
- New feature implemented in the leftmost_edge algorithm. Now the sliding window loop does not necessarily start from the first bin. A generic x value can be set from the xml configuration file. This improves the speed a lot and potentially makes the algorithms more stable. Relevant jira(s): GDQMQ-210
- Minor bug fix in alg__values and test function removed as it was broken.
- Exceptions taken into account when setting the output value in alg__spikes_and_holes
- Exception violations put back in. Relevant jira(s): GDQMQ-188
- Update to pGlobal, added the number of channels per ACD cable. Relevant jira(s): GDQMQ-1
- Minor improvement: errors not displayed for trending plots with no associated errors (used to look as +- 0.000).
- Minor aesthetics change in pAlarmBaseAlgorithm---the algorithm names were misspelled.
- Different way to associate the branch arrays to the relative errors in alg__values; now it's more general and gracefully handles Hiro's trackermon quantities---though it was not crashing before and gave the same number. It essentially suppresses a ROOT warning.
- Cast to float in pAlarmiLimits.getBadness() added to prevent numpy from returning inf upon a ZeroDivision error. Relevant jira(s): GDQMQ-206

dataMonitoring/FastMon: v3r5p6
- pErrorEvent : Move the TIMETONE errors to the lowest bits, so that applying Error_summary < 64 cuts real errors without cutting out the TIMETONE ones - GDQMQ-173
- pError : Reorganize error dictionary to take into account new errors reported by the new version of the LDF. For now, I do not store the details of the errors reported by the different subsystems, besides the TEM_BUG. Relevant jira(s): GDQMQ-173, GDQMQ-197
- Minor improvements in the interface to the output of pErrorEvent.
- More information added to the error summary xml and some internal improvements.
- pFastMonReportGenerator updated accordingly.
- Merging script modified to accommodate the changes.
- Bug fix in pSCPosition Ra axis pointing calculation: folding in 2Pi was not done correctly. Relevant jira(s): GDQMQ-212
- Added a plot to monitor the ACD occupancy in electronic space: AcdHitsCounter_CableChannel_TH2 gives the occupancy per ACD cable and channel; AcdHitChannel[12][18] is the associated tree variable. Relevant jira(s): GDQMQ-1
- New error_summary variable added (new FastMonCfg v1r4p0). Relevant jira(s): GDQMQ-173
- Cable tags added to the details of the phasing errors.
- Improved xml output for the error handler.
- Error documentation moved from the old text file to the python class file.
- Error summary parsing added.
- New error_summary variable added. Relevant jira(s): GDQMQ-197, GDQMQ-173
- Event iterators changed to detect instances of the TEM firmware bug and to no longer care about unphysical strips. Relevant jira(s): GDQMQ-197
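
The error_summary cut mentioned above (GDQMQ-173) can be read as a bit mask convention: since 64 == 1 << 6, a value below 64 can only have bits 0 to 5 set. The sketch below assumes that those six low bits are the ones reserved for TIMETONE errors and that every real error sets bit 6 or higher. The flag names and their bit positions are hypothetical and are not taken from pErrorEvent.

```python
TIMETONE_BITS = 6                             # bits 0-5, because 64 == 1 << 6
REAL_ERROR_THRESHOLD = 1 << TIMETONE_BITS     # == 64

# Hypothetical flags, for illustration only.
TIMETONE_FLAG = 1 << 0    # some TIMETONE condition (low bits)
TEM_BUG_FLAG = 1 << 6     # a real error (bit 6 or above)
PHASING_FLAG = 1 << 7     # another real error


def keep_event(error_summary: int) -> bool:
    """True if the event carries no real error, i.e. at most TIMETONE bits."""
    return error_summary < REAL_ERROR_THRESHOLD
```

With this layout a single comparison keeps events that only carry TIMETONE flags and rejects any event with a real error, whatever its TIMETONE bits are.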

svac/Monitor: v1r2p16
- Modified function RFun::computeratio to check for channel saturation before usage. This addresses (partly) Jira GDQMQ-215.
- New quantities added to address issue from jiras GDQMQ-214
- New quantities added to address issues from jiras GDQMQ-201 and GDQMQ-211.
- FastMon trending of Z axis pointing in galactic coordinates - Jira GDQMQ-198.
- Change in treemerge.cxx to address Jira GDQMQ-191.

svac/TestReport: v6r15
- Abort is accepted as a valid closing reason instead of Pause. Jira FSW-1082

users/richard/pipelineDatasets: v0r5
- Build an index on ("m_runId", "m_eventId") after merge in pruneTuple::prune. LONE-105@JIRA

GPLtools: v1r11
- Clean up a corner-case crash in stagefiles cleanup, and try to avoid it in the first place. LONE-97@JIRA
- Try really hard to avoid creating directories that already exist. LONE-96@JIRA

Complete set of tags for L1Proc 1.62

Code Versions

GlastRelease (sim/recon) v15r33*

ScienceTools (Level 2) : v9r6p2

Science Ops (task defs, scripts):

Level 1 pipeline code and applications running in L1:

svac/L1Pipeline: v1r62*

calibTkrUtil v2r4p1
calibGenTKR v4r5

dataMonitoring/AlarmsCfg: v2r4p2*
dataMonitoring/FastMonCfg: v1r4p4*

dataMonitoring/DigiReconCalMeritCfg: v1r2p10*

dataMonitoring/Common: v3r7p0*
dataMonitoring/FastMon: v3r5p6*
datMonitoring/IGRF: v1r0p1

svac/Monitor: v1r2p16*
svac/EngineeringModelRoot: v4r1p7*
svac/TestReport: v6r15*

users/richard/pipelineDatasets: v0r5*

ft2Util: v1r2p23

evtClassDefs v0r6

GPLtools: v1r11*

Reason for Change

Point source db table has "duplicate" entries inserted by the PGWave detection steps. See JIRA issue in Details section.

Test Procedure

Run by hand using PROD tables. Run in full in dev on test data in /ASP/TestSims2

Rollback Procedure

Revert to ASP/drpMonitoring v1r4

CCB JIRA

ssc-105@jira

Details

Reasons for Change

See ASP JIRA issues in Details section below.

Test Procedure

Tested in dev on data in /ASP/TestData2.

Rollback Procedure

Revert to ASP v2r7p1

CCB JIRA

ssc-104@jira

Details

  • ASP-20@JIRA Archive "permanent" data on xrootd and cleanup ancillary files at end of processing
  • ASP-26@JIRA Have aspLauncher.sh give return code 160 if an already existing nDownlink value is passed.
  • ASP-27@JIRA remove healpix map creation from task
  • ASP-28@JIRA Aperture photometry cut was removing too many sources from the source list
  • ASP-29@JIRA Inspect contents of FT1/2 files to determine if GRB refinement task should be launched
  • ASP v2r8
    • AspHealPix v0r0p1
    • AspLauncher v1r3p3*
    • AspPolicy v0r6
    • BayesianBlocks v0r2
    • asp_pgwave v1r7p1*
    • drpMonitoring v1r4*
    • grbASP v4r5p1*
    • pyASP v3r5p3*

Request to deploy TelemetryTrending front end v2.1.0

Reasons for change

  • XML-RPC code changes:
    • Optimization of the network bandwidth usage in overlay mode
    • More friendly for the XML-RPC servers: split long requests into smaller ones
    • More fault-tolerant and efficient: send the XML-RPC requests to a pool of servers in parallel, instead of to a single server

Test Procedure

This version has been tested on the DEV server.

Rollback procedure

Version 2.0.3 could be reinstalled

CCB Jira

SSC-102@JIRA

Details

  • Code separation between Oracle and XML-RPC data retrieval methods
  • XML-RPC code changes:
    • Optimization of the network bandwidth usage in overlay mode: the mnemonics are grouped into the requests (IOT-101@JIRA)
    • More memory-friendly for the XML-RPC servers: the timespan of the individual requests cannot exceed 24 hours. Larger requests are permitted, but split into smaller ones (IOT-102@JIRA). There is no limit on the length of the timespan on the client side
    • Distributed requests: instead of using 1 single XML-RPC server, we send to a pool of servers (IOT-103@JIRA). This is more friendly for the servers, and more efficient because the requests can be sent in parallel (IOT-104@JIRA), with a controlled degree of concurrency defined per server (for now: if more than 1 request for a given server is in flight, the others are queued. This can be increased later)
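
The request splitting (IOT-102) and the distribution over a pool of servers (IOT-103) can be sketched as follows. This is not the TelemetryTrending code: the function names are invented for the example, and the round-robin assignment is only one simple way of spreading the requests over the pool. The 24 hour limit is the one quoted above.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple

Span = Tuple[datetime, datetime]

MAX_SPAN = timedelta(hours=24)  # per-request limit quoted in the text


def split_timespan(start: datetime, end: datetime,
                   max_span: timedelta = MAX_SPAN) -> List[Span]:
    """Split [start, end) into consecutive spans no longer than max_span."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    spans = []
    cursor = start
    while cursor < end:
        span_end = min(cursor + max_span, end)
        spans.append((cursor, span_end))
        cursor = span_end
    return spans


def assign_round_robin(spans: Sequence[Span],
                       servers: Sequence[str]) -> List[Tuple[str, Span]]:
    """Pair each span with a server, cycling through the pool (an assumed
    policy; the real front end may distribute requests differently)."""
    return [(servers[i % len(servers)], span) for i, span in enumerate(spans)]
```

A 60 hour request would thus become three requests of 24, 24 and 12 hours, which can then be sent to different servers in parallel.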

L1Proc 1.61, which was put into production at 5pm PDT today, has the feature that it crashes in digitization if it detects a mismatch between the LatC key in the data and the LatC key from the expected Moot key. The first delivery we got (80717001) all crashed with the message:

MootSvc             FATAL Hw key 6680 in data does not match hw key belonging to Moot Config #2225
MootSvc             FATAL MootSvc exiting on fatal error..

Going back in the logfiles I can see that this has happened since we started to take nomSciOps_noSkirtCno_noCno runs after the calibOps runs, starting with

 Jul/15/2008 03:02:20.965 nomSciOps_noSkirtCno_noCno, run 237783738

Since we did not crash until today, this was not detected, i.e. we introduced a very useful feature in time! This also means that currently the L1 processing is down.

Talking to the Shift Coordinator it appears that the runs are not nomSciOps_noSkirtCno_noCno runs but plain old nomSciOps. The only differences are

1/ CNO is allowed to open the trigger window.
2/ The skirt tiles are in the CNO.

Looking at the monitoring for a few runs I see indications(!) for this. The deadtime seems to be slightly higher than it should be and the CNO arrival time has a peak at zero.

Note that the data products are completely correct. The only problem this causes (beyond the fact that we seem to be taking nomSciOps when we really wanted to take nomSciOps_noSkirtCno_noCno!) is that the runs are currently mislabeled in the web tables and in the Data Catalog. This can easily be fixed afterwards (we have already done this several times).   

Tomorrow there will be a PROC request to change the run type to what we really want. In the meantime we should continue processing the data as all the data products are correct. Richard, as head of the CCB, allowed us to hotpatch the L1 pipeline to remove the job option that is causing us to crash:

MootSvc.ExitOnFatal = true;

The job options in the pipeline are dynamically generated so the change is just to comment out this line in a pipeline script. 

This is now done! Two of the four crashed jobs are doing fine after having been rolled back. The two other ones were on a batch host that died on us. So we have to wait for the reaper before we can roll them back. But the L1 processing is up again! 

anders 

Reason for change

As more runs are processed we need to fine tune the settings (namely update the dataMonitoring/AlarmsCfg package) to prevent the alarms from firing due to non-issues. The proposed package version is dataMonitoring/AlarmsCfg v2r3p4, as opposed to the v2r2p1 running originally in L1Proc 1.61.

The detailed description of the changes is in the last section.

Test Procedure

We have processed monitoring products from real on-orbit data (LPA) locally with this version of AlarmsCfg.

Rollback procedure

The package can be rolled back to the previous version by flipping a soft link. Also note that the package is completely independent from any other package running in the pipeline and will not cause a version change of L1Proc.
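
Flipping a soft link as described above can be done atomically, so that a running job never sees a missing link. The sketch below shows the usual approach of creating a temporary link and renaming it over the old one. The function name flip_link and the ".tmp" suffix are assumptions, and the rename is only guaranteed to be atomic on POSIX file systems.

```python
import os


def flip_link(link_path: str, new_target: str) -> None:
    """Point the soft link at link_path to new_target in one atomic step."""
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)            # clear a leftover from an earlier attempt
    os.symlink(new_target, tmp_path)   # build the new link next to the old one
    os.replace(tmp_path, link_path)    # rename over the old link (atomic on POSIX)
```

Rolling back is then the same call with the previous version directory as the target.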

CCB Jira

SSC-100@JIRA

Details

FastMon end of run:

  • Limits on the number of triggers per tower (for all trigger primitives) made a bit wider as non-uniformities among centre vs. corner towers were causing warnings.
  • Limits on the average number of TkrHits per tower changed as the fluctuations among different orbits turned out to be bigger than anticipated, based on the very first runs. Alarm on the RMS of the same plot removed as the distribution has an exponential shape and it wasn't really adding any more information.
  • Exception added in order not to report the noisy strip in tower 3 as an issue.

Digi end of run:

  • Limits updated according to the changes in FastMon end of run (they all refer to quantities that are available both in FastMon and in Digi, so that they should have the same alarms with the same limits).

Digi trending:

  • Upper warning limit on the quantity OutF_Normalized_TkrHits_TowerPlane moved a bit as it was really too tight.

Recon end of run:

  • Got rid of the alarms on four vertex-related quantities as the corresponding distributions were too wide and dependent on the pointing vs. scanning mode to put sensible limits.
  • Minor adjustments of the limits on the following quantities: SuspCalHi_Highest0_EnergyDistribution_TH1 (leftmost_edge), ReconAcdPhaMips_PMTA_TH1_AcdTile (x_average and x_rms), ReconAcdGlobalPosNotMatchedTracks_RibbonYOriented_TH1_XYZ (gauss_rms)

Merit trending:

  • Changed the upper limits on the event rates for the three event classes. Those had been tuned before without taking into account the values when the earth limb is in the field of view, so they're considerably larger, now.

Reason for change

A couple of fixes:

  • Better graphical representation of upper limits in plots.
  • Redirection to login page when permissions are required

Test Procedure

This version has been tested on the DEV server.

Rollback procedure

Version 0.2.0 can be easily put back in place.

CCB Jira

SSC-99@JIRA

Details


Reason for change

C&A has requested that L1Proc upgrades to Science Tools v9r6p2 because of desired changes to makeFt1: "Modify makeFT1 so that it incorporates full source class information in CTBCLASSLEVEL column".

Hiro has requested that a new version of CalibTkrUtil is put into the L1 pipeline as soon as possible (improved MIP-selection, produce hot strip xml file when necessary).

We have also added a JO that makes digitization crash if it detects a mismatch between the LatC key in the data and the LatC key expected from the Moot key. This will make it easier to catch the case where the Mission Planning is not in sync with the real acquisitions, and will keep us from reconstructing a run using the wrong configuration (which affects FSW prescales and Gleam filters).

Because of the nature of the changes we do not expect any merging issues in PROD between L1Proc versions for partially processed runs.

Test Procedure

We have processed real on-orbit data (LPA and LCI) on the DEV server with this version of L1Proc. Jim Chiang has signed off the new FT1 file.

Rollback procedure

L1Proc can be rolled back to the previous version.

CCB Jira

SSC-98@JIRA

Details

L1Pipeline v1r61
- Fix bug that prevented registration of LCI data. LONE-90@JIRA
- Use MootSvc.ExitOnFatal="true"; when digitizing real data.
- Use makeFT1_kluge and Pass6_kluge_Classifier. LONE-91@JIRA

Science Tools v9r6p2:
- Modify makeFT1 so that it incorporates full source class information in CTBCLASSLEVEL column:  JIRA LONE-91.
- L1Proc should use Pass6_kluge_Classifier and makeFT1_kluge

CalibTkrUtil v2r4p1:
- Greatly improves MIP selection to make efficiency monitoring independent of LAT configuration. This version also incorporates new monitoring code that produces a hot strip xml file when necessary.

calibGenTKR v4r5:
- Needed for CalibTkrUtil v2r4.

Job Options update for digitization:
- This will get into the standard Gleam job options in the next GR. For the time being we write it directly in the pipeline script.
- MootSvc.ExitOnFatal = true;

dataMonitoring/AlarmsCfg: v2r2p1
- Got rid of the alarm on the rms of TkrHits_TH1_Tower (FastMon)
- Updated tkr monitoring alarms as per Hiro's e-mail.
- Alarm limits on trackermon_trend quantities changed from percentages to fractions, following Hiro's convention.

Complete set of tags for L1Proc 1.61

Code Versions

GlastRelease (sim/recon) v15r24

ScienceTools (Level 2) : v9r6p2*

Science Ops (task defs, scripts):

Level 1 pipeline code and applications running in L1:

svac/L1Pipeline: v1r61*

calibTkrUtil v2r4p1*
calibGenTKR v4r5*

dataMonitoring/AlarmsCfg: v2r2p1*
dataMonitoring/FastMonCfg: v1r3p1

dataMonitoring/DigiReconCalMeritCfg: v1r2p3

dataMonitoring/Common: v3r4p1
dataMonitoring/FastMon: v3r4p2
datMonitoring/IGRF: v1r0p1

svac/Monitor: v1r2p11
svac/EngineeringModelRoot: v4r1p6
svac/TestReport: v6r15

users/richard/pipelineDatasets: v0r4

ft2Util: v1r2p23

evtClassDefs v0r6

GPLtools: v1r10

Reason for Changes

Bug in GRB_refinement plot generation for light curve when one or fewer photons are returned. Forgot to update parameter names in call to gtexposure for afterglow analysis. Minor improvements to drpMonitoring, asp_pgwave.

Test Procedure

Tested by hand on relevant data in prod. Run in dev on data in /ASP/TestData2.

Rollback Procedure

Revert to ASP v2r7.

CCB Jira

ssc-97@jira

Details

  • ASP-21@jira light curve plotting fails in GRB_refinement if only one or zero events are returned for a time/region selection
  • ASP-22@jira update gtexposure parameter names in grbASP/afterglowAnalysis.py
  • ASP-23@jira Modify light curve FITS file according to David's comments
  • ASP-24@jira modify runsrcid script to send Flare Advocate emails to full list only when running in prod mode
  • ASP v2r7p1
    • AspHealPix v0r0p1
    • AspLauncher v1r3p2
    • AspPolicy v0r6
    • BayesianBlocks v0r2
    • asp_pgwave v1r6p2*
    • drpMonitoring v1r3p4*
    • grbASP v4r4p2*
    • pyASP v3r5p2
    • sourceDet v0r0p1

Reason for change

The main reason is to update the ingestion script to allow the ingestion of new quantities.

The front-end has small improvements and a bug fix to a null pointer exception.

Please notice that this is linked to SSC-94@JIRA and should happen at the same time.

Test Procedure

This version has been tested on the DEV server.

Rollback procedure

Version 1.0.4 can be easily put back in place.

CCB Jira

SSC-96@JIRA

Details


Reason for change

The reason for the change is twofold. In this new version we have decoupled the alarm settings from the version of L1Proc. While they will still be CCB controlled, we will be able to update them much more quickly and without having to make a new version of L1Proc. The alarm settings have also been updated to better reflect the new configuration, which is now the default. We are also fixing a dependency issue with one of the modules running in L1Proc (Verify - a data consistency check module). Currently it quite often crashes and has to be rolled back. The current data processing situation is complex enough that it's worthwhile reducing unnecessary noise for the shifters whenever possible. It also reduces the load on the L1 pipeline experts.

Because of the nature of the changes we do not expect any merging issues in PROD between L1Proc versions for partially processed runs.

Test Procedure

We have processed real on-orbit data on the DEV server with this version of L1Proc.

Rollback procedure

L1Proc can be rolled back to the previous version.

CCB Jira

SSC-94@JIRA

Details

L1Pipeline v1r60
- Fixing the dependency of the Verify module (sometimes it was executed before the merged digi file was registered, and couldn't get the correct version information from the Data Catalog).
- Use a link to point to AlarmsCfg LONE-88@JIRA.
- Use error merger provided by FastMon instead of our own LONE-70@JIRA.
- Run alarms on merit histograms and trending LONE-87@JIRA.
- Remove redundant dependence from registerM7L1.

dataMonitoring/AlarmsCfg: v2r1p1
- New variables added for the distributions of the alarms output. GDM-27@JIRA
- New alarm on the CAL lac threshold, and a couple of obsolete ones removed. GDQMQ-205@JIRA
- Alarms back on the 2D cal trigger plots (leftmost_edge_slices).
- Digi eor and trend limits and settings reviewed after the change in the default configuration.
- Minor change to the recon eor configuration file.
- Recon eor limits and settings reviewed after the change in the default configuration.
- Wrong sign in the digi eor limits changed.
- Updated the exception file for the alarms on the digi trending to take into account the broken GTRC in tower 10, plane 0.

dataMonitoring/Common: v3r4p1
- Minor bug fix.
- New alg__leftmost_edge_slices algorithm.
- Minor bug fix in alg__values.
- alg__x_min_bin_slices improved taking advantage of the improvements in the base class since v3r3p0. Relevant jira(s): GDQMQ-52
- Threshold for the minimum TrueTimeInterval in alg__values changed from 1 s to 5 s.
- Explanatory label added to the alarm handler xml output.
- Check on the TrueTimeInterval implemented to prevent the alarms on the trending quantities from firing on tiny time bins.
- Fix for the DoubleDiffRate quantities, which do not have associated errors in the tree.
- One more zero division error bug fix. And yes, different from the previous tag. Hopefully the last one... Relevant jira(s): GDQMQ-204
- One more zero division error bug fix. Relevant jira(s): GDQMQ-204
- Bug fix in the spikes_and_holes alarm algorithm. Relevant jira(s): GDQMQ-202
- Error bars on the trending plots should be now correctly handled.
- More improvements in the alarm outputs. Relevant jira(s): GDQMQ-147
- Got the timestamp right for the alarms on the trending quantities.
- Improved formatting of error details.
- Exception mechanism implemented for the alarms on the trending quantities. Also a partial bug fix related to the exceptions themselves. Relevant jira(s): GDQMQ-188 

svac/TestReport: v6r15
- Abort is accepted as a valid closing reason for the run. Relevant jira(s): FSW-1078.

Complete set of tags for L1Proc 1.60

Code Versions

GlastRelease (sim/recon) v15r24

ScienceTools (Level 2) : v9r5p5

Science Ops (task defs, scripts):

Level 1 pipeline code and applications running in L1:

svac/L1Pipeline: v1r60*

calibTkrUtil v2r2p4 

dataMonitoring/AlarmsCfg: v2r1p1*
dataMonitoring/FastMonCfg: v1r3p1

dataMonitoring/DigiReconCalMeritCfg: v1r2p3

dataMonitoring/Common: v3r4p1*
dataMonitoring/FastMon: v3r4p2
datMonitoring/IGRF: v1r0p1

svac/Monitor: v1r2p11
svac/EngineeringModelRoot: v4r1p6
svac/TestReport: v6r15*

users/richard/pipelineDatasets: v0r4

ft2Util: v1r2p23

evtClassDefs v0r6

GPLtools: v1r10

Reasons for Change

  • Flux estimates were not being calculated for all of the sources found by asp_pgwave and reported in the ASP Data Viewer page. Gino implemented a fix and improved the light curve calculations for variable source detection.
  • We need to move to ST v9r6p1 in order to use a more modern version of pointfit that provides sensible TS estimates for the post-filtering of the pgwave sources.
  • Data catalog queries only return the most recent version of FT1/2 for a run and L1Proc has (had?) not been producing consistent valid FT1 and FT2 files for every run, so the files returned by the query may be out-of-sync.
  • Minor fix in grbASP to sort the input FT1 files to ensure event time ordering.

Test Procedure

Tested in dev using test data in /ASP/TestData2. Tested by hand on some of the previously problematic time intervals that had inconsistent FT1/2 files from a query made in prod.

Rollback Procedure

Revert to ASP v2r6 + ASP/grbASP v4r4

CCB JIRA

ssc-95@jira

Details

  • ASP-5@JIRA need to check that FT1/2 files are in sync
  • ASP-9@JIRA pointfit removes too many sources found by pgwave
  • ASP-16@JIRA PGWave is not computing flux estimates for all of the sources it claims to find
  • ASP v2r6
    • AspHealPix v0r0p1
    • AspLauncher v1r3p1*
    • AspPolicy v0r6
    • BayesianBlocks v0r2
    • asp_pgwave v1r6p1*
    • drpMonitoring v1r3p3*
    • grbASP v4r4p1*
    • pyASP v3r5p2
    • sourceDet v0r0p1