Blog from October, 2008

Reason for change

We have fine-tuned some alarm limits after the first runs taken with the new configuration (Moot key 2450). The proposed version of the dataMonitoring/AlarmsCfg package is v5r3p1, replacing v4r10p2, which was previously running in L1Proc 1.68.

The detailed description of the changes is in the last section.

Test Procedure

We have processed monitoring products from real on-orbit data (LPA) locally with this version of AlarmsCfg.

Rollback procedure

The package can be rolled back to the previous version by flipping a soft link. Also note that the package is completely independent from any other package running in the pipeline and will not cause a version change of L1Proc.
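
For reference, flipping a soft link can be sketched as below. This is only a minimal illustration, not the actual deployment script: the function name flip_link and the directory layout in the usage note are hypothetical, and it assumes a POSIX file system where a rename replaces the old link in one step.

```python
import os


def flip_link(link_path, new_target):
    """Repoint the soft link at link_path to new_target.

    A temporary link is created next to the real one and then renamed
    over it, so readers never see a missing link.
    """
    link_dir = os.path.dirname(os.path.abspath(link_path))
    tmp_link = os.path.join(link_dir, ".flip_link_%d" % os.getpid())
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)          # clean up a leftover from a failed run
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)  # rename replaces the old link itself
    return os.readlink(link_path)
```

With a hypothetical layout where a link named AlarmsCfg sits next to the version directories, a rollback would be flip_link("AlarmsCfg", "v4r10p2").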

CCB Jira

SSC-155@JIRA

Details

v5r3p1

  • Disabled all the new alarms on the timing in fastmon and digi eor, since they still need tuning and reference histograms.

v5r3p0

  • Some more fine tuning of the alarm limits based on the first data taken in the new configuration (Moot key 2450).
  • Limits on AcdPedPedRMSDifference* in acdpedsanalyzer relaxed.
  • Limits on RPM_GainMeanDifference_TH1 in calgainsanalyzer relaxed.
  • One exception removed from the trackermon (hot strip masked).
  • Added exception for digi trend, Mean_TkrHitsPerEvt_TowerPlane and OutF_Normalized_TkrHits_TowerPlane, tower 0 GTFE 14.
  • Limits on calpedsanalyzer CalXAdcPedPedRMSDifference_LEX1_TH1 relaxed.
  • Limits on CalX_Total_NHit_TH1 (low_high_ratio) in digi and fastmon eor relaxed a little bit.
  • Limits on digi eor TkrPlanesHit_TH1_Tower put separately on tower 0, for which we have a noisy strip deforming the distribution.

v5r2p1

  • Some limits on the acd peds relaxed.
  • Relevant jira(s): GDQMQ-279.

v5r2p0

  • All the y-values alarms on digi and fastmon eor updated to the new trigger window width (14 ticks instead of 12).
  • Limits relaxed on some of the timing quantities in digi and fastmon eor.

v5r1p0

  • Channel 1983 added as an exception on the LEX1 and LEX8 pedestal rms values (cal_eor).
  • Warning max for CalXAdcPedRMS_HEX8_TH1 (y_values) in calpeds_eor moved from 3.7 to 4.0 (the margin wasn't large enough).
  • Warning min for CalXAdcPedPedRMSDifference_HEX8_TH1 (y_values) in calpeds_eor moved from 0.25 to 0.15.
  • Warning min for CalXAdcPedPedRMSDifference_LEX8_TH1 (y_values) in calpeds_eor moved from -1.0 to -1.5.
  • Warning min for RPM_RMS_TH1 (y_values) in calgains_eor moved from 0.15 to 0.10.
  • Relevant jira(s): GDQMQ-279
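
To illustrate what moving a warning limit or adding a channel exception does in practice, here is a minimal sketch of a y_values check. It is not the AlarmsCfg implementation: the function check_y_values, its arguments, and the sample values are all hypothetical; only the 3.7 and 4.0 limits come from the notes above.

```python
def check_y_values(values, warning_min=None, warning_max=None, exceptions=()):
    """Return the channel indices whose value falls outside the warning limits.

    Channels listed in `exceptions` are skipped, the way a known bad
    channel is excluded from an alarm.
    """
    flagged = []
    for channel, value in enumerate(values):
        if channel in exceptions:
            continue
        if warning_min is not None and value < warning_min:
            flagged.append(channel)
        elif warning_max is not None and value > warning_max:
            flagged.append(channel)
    return flagged


# Made-up pedestal rms values for four channels.
rms = [3.2, 3.8, 4.1, 3.5]

print(check_y_values(rms, warning_max=3.7))                  # [1, 2]
print(check_y_values(rms, warning_max=4.0))                  # [2]
print(check_y_values(rms, warning_max=4.0, exceptions={2}))  # []
```

Raising the warning max from 3.7 to 4.0 clears the channel that was only marginally over the old limit, while the exception silences a channel that is known to misbehave.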

v5r0p1

  • Configuration file for the error logger on the verify module implemented.
  • Relevant jira(s): GDQMQ-250

v5r0p0

  • Implement alarms on timing quantities, mostly Condition Arrival Times, in both fastmon and digi. Settings will have to be tuned when the new timing configuration is uploaded.
  • First use of the reference histograms for alarms
  • Add an alarm on CondSummaryWord using the reference histogram, as a test
  • Relevant jira(s): GDQMQ-271

v4r11p0

  • Configuration file for the pErrorLogger on the FastMon fully implemented.
  • Relevant jira(s): GDQMQ-250

Reason for change

The main reason for change is that we now have in place all the alarms on ACD/CAL monitoring products that we agreed on with the subsystem experts. The proposed version of the dataMonitoring/AlarmsCfg package is v4r10p2, replacing v4r7p5, which was originally running in L1Proc 1.68.

The detailed description of the changes is in the last section.

Test Procedure

We have processed monitoring products from real on-orbit data (LPA) locally with this version of AlarmsCfg.

Rollback procedure

The package can be rolled back to the previous version by flipping a soft link. Also note that the package is completely independent from any other package running in the pipeline and will not cause a version change of L1Proc.

CCB Jira

SSC-154@JIRA

Details

v4r10p2

  • Alarm limits for the alarms added in tags v4r10p0 and v4r10p1 relaxed a little bit.

v4r10p1

  • New alarms on fastMon eor CalX_Total_NHit_TH1 and CalX_NHit_TH1_Tower_*.
  • Relevant jira(s): GDQMQ-230

v4r10p0

  • New alarms on digi eor CalX_Total_NHit_TH1 and CalX_NHit_TH1_Tower_*.
  • Relevant jira(s): GDQMQ-230

v4r9p3

  • Alarms tuned on CalGainsAnalyzer.
  • Relevant jira(s): GDQMQ-279

v4r9p2

  • Alarms tuned on CalPedsAnalyzer.
  • Relevant jira(s): GDQMQ-279

v4r9p1

  • Alarms tuned on calHist.
  • Relevant jira(s): GDQMQ-279

v4r9p0

  • Alarms tuned on AcdPedsAnalyzer.
  • Relevant jira(s): GDQMQ-279

v4r8p1

  • Put an alarm on digi eor Delta_CCSDSTime_EvtTime_Epu0_TH1 and Delta_CCSDSTime_EvtTime_Epu1_TH1.
  • Relevant jira(s): GDQMQ-280

v4r8p0

  • Low limits on digi trend/Rate_AcdGemCNO_GARC lowered to 0.3/0.5.
  • Alarm added on digi trend OutF_Ratio_EvtSize_CompressedEvtSize quantity.
  • Relevant jira(s): GDQMQ-276

Reason for Change

See JIRA items in Details section.

Test Procedure

Tested in dev on data in /ASP/TestSims2. Also tested in prod for the cases where the problems arose, i.e., for ASP-52, ASP-53.

Rollback Procedure

Revert to ASP v2r10p1.

CCB JIRA

ssc-151@jira

Details

  • ASP-52@JIRA The test for duplicate stream exceptions in aspLauncher.py is not correct. This issue affects rollbacks of the klugeASP process by L1Proc.
  • ASP-53@JIRA Using the 1 sec FT2 file in L1Proc for gtmktime introduces GTIs with zero events after zenith angle cutting. This affects only the GRB BlindSearch.
  • ASP-54@JIRA Modify the PGWave query on the pointsources table to restrict it to DRP, blazar group, or PGWAVE sources. The PGWAVE task did not properly account for the introduction of non-ASP sources in the db tables.
  • ASP v2r10p2
    • AspHealPix v0r0p1
    • AspLauncher v1r3p6*
    • AspPolicy v0r6p1
    • BayesianBlocks v0r2
    • asp_pgwave v1r8p3*
    • drpMonitoring v1r6p3
    • grbASP v4r7p1*
    • pyASP v3r5p6
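
The GTI problem in ASP-53 can be pictured with a small sketch that drops intervals containing no events. This is not the fix that went into ASP (the notes above do not describe it); the function drop_empty_gtis and the sample numbers are hypothetical and serve only to show the situation.

```python
from bisect import bisect_left, bisect_right


def drop_empty_gtis(gtis, event_times):
    """Keep only the (start, stop) intervals holding at least one event.

    Interval bounds are treated as inclusive.
    """
    times = sorted(event_times)
    kept = []
    for start, stop in gtis:
        # Number of events with start <= t <= stop.
        n_events = bisect_right(times, stop) - bisect_left(times, start)
        if n_events > 0:
            kept.append((start, stop))
    return kept


# Made-up example: the one-second interval in the middle has no events.
gtis = [(0.0, 10.0), (10.5, 11.5), (20.0, 30.0)]
events = [1.5, 25.0]
print(drop_empty_gtis(gtis, events))  # [(0.0, 10.0), (20.0, 30.0)]
```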

Request to deploy TelemetryTrending front end v2.1.1

Reasons for change

  • XML-RPC code changes:
    • Using the much simpler "version 1" of the xml-rpc getTrending protocol
    • Allow up to 4 simultaneous requests per server because the xml-rpc server is now fully multi-process
    • Limit the timespan of a web trending request to 7 days (instead of 24hr)

Test Procedure

This version has been tested on the DEV server: http://glast-tomcat03:8080/TelemetryTrending

Rollback procedure

Reinstall version 2.1.0.

CCB Jira

SSC-148@JIRA

Details

  • updated the code in DataChannel.java to un-marshall the trending data according to the "version 1" of the getTrending xml-rpc protocol, as described in DATA_FORMATS.txt
  • changed the MAX_CONTENTION constant to 4, to allow 4 simultaneous RPCs to the same server. This is because the servers are now multi-process, with 1 process per CPU core. The machines being quad-core, we have 4 server processes on each.
  • changed the SPLIT_REQSPAN_MS constant to -1, to disable splitting large requests into smaller sub-requests sent to different servers. Splitting is now done locally on each server, to take advantage of the multiple cores, and we want to maximize cache usage on each server (don't distribute the data too much).
  • changed the MAX_AUTO_REQSPAN_MS constant to 7 days, to allow users to request trending graphs over a larger period.
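
The effect of the two limits can be sketched as follows. The real code is Java (DataChannel.java); this Python version is only an illustration. The class ServerGate and its call method are hypothetical, and rejecting an over-long request with an error is an assumption, since the front end may handle it differently.

```python
import threading

MAX_CONTENTION = 4                          # simultaneous requests per server
MAX_AUTO_REQSPAN_MS = 7 * 24 * 3600 * 1000  # 7 days, in milliseconds


class ServerGate:
    """Guard one trending server: cap concurrency and request timespan."""

    def __init__(self, max_contention=MAX_CONTENTION):
        self._slots = threading.BoundedSemaphore(max_contention)

    def call(self, request_fn, start_ms, end_ms):
        # Assumption: over-long requests are refused outright.
        if end_ms - start_ms > MAX_AUTO_REQSPAN_MS:
            raise ValueError("requested timespan exceeds 7 days")
        # Blocks while MAX_CONTENTION requests are already in flight.
        with self._slots:
            return request_fn(start_ms, end_ms)
```

One gate per server keeps at most four requests in flight, matching the four server processes on a quad-core machine.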

Reasons for Change

  • The AGN group requests that DRP_monitoring be run on a six-hour time scale.
  • Transient class events have backgrounds too high to be used for GRB afterglow analysis, so we switch to diffuse class events.

Test Procedure

Tested on dev on data in /ASP/TestSims2

Rollback Procedure

Revert to ASP v2r10

CCB JIRA

ssc-147@jira

Details

  • ASP-50@JIRA Use diffuse class events for afterglow analysis
  • ASP-51@JIRA Run DRP_monitoring on the six-hour time scale
  • ASP v2r10p1
    • AspHealPix v0r0p1
    • AspLauncher v1r3p5*
    • AspPolicy v0r6p1
    • BayesianBlocks v0r2
    • asp_pgwave v1r8p2*
    • drpMonitoring v1r6p3
    • grbASP v4r7*
    • pyASP v3r5p6

Request to deploy ASP v2r10

Reason for Change

See the three JIRAs below.

Test Procedure

Tested on dev using data in /ASP/TestSims2.

Rollback Procedure

Revert to ASP v2r9p1.

CCB JIRA

ssc-146@jira

Details

  • ASP-46@JIRA add code to skip/expire analyses of GRB sources after a certain time interval
  • ASP-47@JIRA fill spectral index information in lightcurves table
  • ASP-48@JIRA delay PGWave launching until after the end of the pending interval
  • ASP v2r10
    • AspHealPix v0r0p1
    • AspLauncher v1r3p4*
    • AspPolicy v0r6p1
    • BayesianBlocks v0r2
    • asp_pgwave v1r8p1
    • drpMonitoring v1r6p3*
    • grbASP v4r6p1*
    • pyASP v3r5p6