Blog from May, 2009

Reason for change

The new version (v5r14p1, as opposed to v5r13p1) introduces some minor modifications to accommodate a new cal noisy channel (2414).

Test Procedure

We have processed monitoring products from real on-orbit data (LPA) locally with this version of AlarmsCfg.

Rollback procedure

The package can be rolled back to the previous version by flipping a soft link. Note also that the package is completely independent of any other package running in the pipeline and will not cause a version change of L1Proc.
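The soft-link flip can be done atomically so the "current" link never dangles mid-rollback. A minimal sketch, assuming the versioned install directories sit next to a `current` symlink (the actual layout is not given in the post):

```python
import os
import tempfile

def point_current_to(link_path, target):
    """Repoint a 'current' soft link at a (possibly older) version directory.

    The new link is created under a temporary name and then renamed over
    the old one; on POSIX the rename is atomic, so readers always see
    either the old or the new target, never a missing link.
    """
    tmp = tempfile.mktemp(dir=os.path.dirname(link_path) or ".")
    os.symlink(target, tmp)
    os.replace(tmp, link_path)
```

Rolling back is then a single call, e.g. `point_current_to("current", "v5r13p1")`.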

CCB Jira

SSC-203@JIRA

Details (release notes for dataMonitoring/AlarmCfg for v5r14p1)

v5r14p1

  • CalXAdcPedRMS_LEX8_TH1: warning and error limits moved down to 30/30 for channel 2414.
  • CalXAdcPedRMS_LEX1_TH1: warning and error limits moved down to 3.6/3.6 for channel 2414.

v5r14p0

  • Minor changes to the alarm limits and exceptions to take care of the cal channel 2414; see the e-mail thread at http://www-glast.stanford.edu/protected/mail/datamon/2605.html
    The detailed changes are:
  • CalXAdcPedRMS_LEX8_TH1: warning and error limits moved from 10/15 to 40/40 for channel 2414.
  • CalXAdcPedRMS_LEX1_TH1: warning and error limits moved from 1.2/1.8 to 5/5 for channel 2414.
  • Added exceptions (status on violation will be clean) for channel 2414 on CalXAdcPedPedRMSDifference_LEX8_TH1, CalXAdcPedPedRMSDifference_LEX1_TH1, CalXAdcPedPedMeanDifference_LEX8_TH1, CalXAdcPedPedMeanDifference_LEX1_TH1.

Reason for change

Improved the file server filter used to decorate file systems.

CCB Jira

SSC-201@JIRA

Reason for change

During the last week, we have experienced a somewhat serious network problem on the connection between the MOC and SLAC. This problem was exacerbated by a few missed contacts due to ARRs and (probably) by a non-optimal TDRSS schedule due to the Space Shuttle mission. This resulted in several instances (at least one per day) of 'bunched-up' deliveries: 3 or 4 deliveries showing up at the same time for processing, which had to be manually throttled into L1 processing (the pipeline required 24/7 monitoring throughout the last week).

We implemented an automatic throttling method for L1, which allows only a limited number of runs to start processing at the same time. This is implemented using a fixed number (3, for the time being) of global locks: for each run, the findChunks pre-empt script will try to allocate one of the available locks; if this succeeds, the run will start processing and the lock will be released by mergeMeritChunks (at approximately 70% of the processing). If no lock can be obtained, the pre-empt script will wait 10 minutes and try again.
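The acquire/retry/release cycle described above can be sketched with lock files created atomically via `O_EXCL` (the actual lock implementation inside the pipeline is not described in the post, so this is only an illustration of the scheme; the function names echo the findChunks/mergeMeritChunks roles but are hypothetical):

```python
import os
import time

N_LOCKS = 3          # runs allowed to process concurrently (3 for now)
RETRY_SECONDS = 600  # pre-empt script waits 10 minutes between attempts

def try_acquire(lock_dir):
    """Try to grab one of the N_LOCKS global throttling locks.

    Each lock is a file created with O_EXCL, so creation is atomic even
    across nodes sharing a file system.  Returns the lock path on
    success, or None if all locks are currently held.
    """
    for i in range(N_LOCKS):
        path = os.path.join(lock_dir, "throttle.%d.lock" % i)
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return path
        except FileExistsError:
            continue
    return None

def release(lock_path):
    """Free the lock; in the real pipeline this happens in the merge
    step, at roughly 70% of the run's processing."""
    os.remove(lock_path)

def wait_for_lock(lock_dir, retry=RETRY_SECONDS):
    """Pre-empt step: block until a throttling lock becomes available."""
    while True:
        lock = try_acquire(lock_dir)
        if lock is not None:
            return lock
        time.sleep(retry)
```

With 3 locks, a fourth simultaneous run simply keeps retrying until one of the first three reaches its release point.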

Test Procedure

We have processed data runs in the DEV pipeline with this mechanism, which worked as expected.
The overall performance of the system with throttling has not been tested: we don't expect the mechanism to introduce a general slowdown under normal operating conditions, but we will know this for sure only once it's deployed.
The 2 main throttling parameters (the number of available locks and the position at which the throttling lock is removed) can be adjusted by modifying a config file (this doesn't require the upload of a new L1Proc).
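The two tunables might look like the fragment below; the key names are illustrative, since the actual config file format is not shown in the post:

```python
# Hypothetical L1Proc throttling configuration (key names are
# assumptions; only the two parameters themselves come from the post).
THROTTLE_LOCKS = 3            # number of runs allowed to process at once
THROTTLE_RELEASE_POINT = 0.70  # fraction of processing at which the lock is freed
```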

Rollback procedure

We can switch back to the previous version of L1Proc.

CCB Jira

SSC-200@JIRA

Details

L1Pipeline v1r73:
- Adding an automatic throttling mechanism for L1Proc.

Reason for change

Telemetry Trending 24-hour reports are currently taking 7 hours to generate. The bottleneck was traced to the query that finds the full-history data points, specifically the part that finds the first point before the lower time edge in the statistical data tables.

The query has been changed to use the indexes on the statistical tables efficiently.
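The "first point before the lower time edge" lookup can be written in an index-friendly form: a descending range scan on the time column, stopped after one row. A small SQLite sketch (table and column names are assumptions, not the actual trending schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stat_points (t REAL, value REAL)")
conn.execute("CREATE INDEX idx_stat_t ON stat_points (t)")
conn.executemany("INSERT INTO stat_points VALUES (?, ?)",
                 [(float(t), float(t * 2)) for t in range(1000)])

def first_point_before(conn, t_low):
    # WHERE t <= ? ORDER BY t DESC LIMIT 1 walks the index backwards
    # from t_low and stops at the first hit, instead of (say) a MAX()
    # subquery that some planners turn into a much larger scan.
    return conn.execute(
        "SELECT t, value FROM stat_points "
        "WHERE t <= ? ORDER BY t DESC LIMIT 1", (t_low,)).fetchone()
```

For example, `first_point_before(conn, 500.5)` returns the point at t = 500.0.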

Test Procedure

This version has been tested on the DEV server.

Rollback procedure

Version 2.2.2 can be reinstalled.

CCB Jira

SSC-198@JIRA