status: In Progress
last update: 21 Feb 2014
*** PAGE IN PROGRESS: Not yet ready for public consumption ***
"New generation" tasks (using SCons builds, rewritten task scripts, common python scripts, etc.)
- P203-FITS - This task reads MERIT and produces FT1 (photons) + EXTENDEDFT1
This task is identical with P202-FITS with the following exceptions:
- The dataset reprocessed is extended to include all current (Level 1) data since the end of the P202 task.
- New diffuse response
- Only FT1 and EXTENDEDFT1 data products are produced
- File naming and other bookkeeping uses "P203" rather than "P202", where appropriate
Bookkeeping
- (This page): Define ingredients of reprocessing (processing code/configuration changes)
- Processing History database: http://glast-ground.slac.stanford.edu/HistoryProcessing/HProcessingRuns.jsp?processingname=P203
- List of all reprocessings
- List of all data runs reprocessed
- Pointers to all input data files (-> dataCatalog)
- Pointers to associated task processes (-> Pipeline II status)
- Data Catalog database: http://glast-ground.slac.stanford.edu/DataCatalog/folder.jsp
- Lists of and pointers to all output data files
- Meta data associated with each output data product
P203-FITS
This task generates FT1 and EXTENDEDFT1 data products.
Status chronology
- 2/20/2014 - Initial set-up of task
- 2/25/2014 - Begin trickleStream on Block 1 – entire P202 run list
- 2/28/2014 - Many problems: LSF appears to accumulate a large number of jobs and then dispatch them all at once. The resulting burst of jobs overloads AFS (every job reads the ~500 MB diffuse response file) and the dataCatalog (every job queries for the FT2 file and for the next file version of each data product). The trickleStream was therefore reconfigured to a mere dribble, with typically no more than 300-400 jobs running simultaneously. This means a 1-day task has become a week-long effort.
- 3/3/2014 - Block 1 complete. Last week, a new scheme was developed to move all dataCatalog queries out of the batch jobs and into jython scriptlets. These changes will be integrated before any more data are processed. Validity check: the reprocessed P203 data contain the same number of runs and the same number of events as the P202 data.
  310,326,817 events in 29158 files. The run range for Block 1 (== entire range of P202) is 239557414 through 405329691.
- 3/4/2014 - Configure Block 2, consisting of Level 1 data from 5 Nov 2013 through 28 Feb 2014.
#runs 30915 #evts 67249594218 start 239557417 2008-08-04 15:43:37 stop 415328595 2014-03-01 01:03:15
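The validity check noted in the 3/3/2014 entry (same runs, same event counts in P203 as in P202) could be sketched roughly as follows. This is only an illustration: `RunSummary` and `compare_reprocessings` are hypothetical names, and the per-run event counts are assumed to have been obtained already (for example from dataCatalog metadata); no actual dataCatalog API is used here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical per-run record: run number and number of events in its file."""
    run_id: int
    n_events: int


def compare_reprocessings(old: Iterable[RunSummary],
                          new: Iterable[RunSummary]) -> Dict[str, object]:
    """Compare two reprocessings run by run.

    Reports runs present in only one of the two sets and runs whose
    event counts differ, plus the total event count of each set.
    """
    old_by_run = {r.run_id: r.n_events for r in old}
    new_by_run = {r.run_id: r.n_events for r in new}

    missing: List[int] = sorted(old_by_run.keys() - new_by_run.keys())
    extra: List[int] = sorted(new_by_run.keys() - old_by_run.keys())
    mismatched: List[int] = sorted(
        run for run in old_by_run.keys() & new_by_run.keys()
        if old_by_run[run] != new_by_run[run]
    )

    return {
        "missing_runs": missing,            # in old, absent from new
        "extra_runs": extra,                # in new, absent from old
        "event_count_mismatches": mismatched,
        "old_total_events": sum(old_by_run.values()),
        "new_total_events": sum(new_by_run.values()),
        "consistent": not (missing or extra or mismatched),
    }
```

Comparing per run rather than only comparing totals makes it possible to see which run is responsible when the totals disagree.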
Configuration
Identical with P202-FITS except:
Diffuse Model | based on contents of /afs/slac.stanford.edu/g/glast/ground/GLAST_EXT/diffuseModels/v4r1 |
Output Data Products |
Generation of output data products:
Data Product | destination | data content [1] | event selection [1] | makeFT1 | gtselect | gtdiffrsp | gtmktime |
---|---|---|---|---|---|---|---|
EXTENDEDFT1 | SLAC | FT1variables | ((FT1EventClass & 0x00003EFF)!=0) | | | | |
FT1 | FSSC+SLAC | FT1variables | 'source' and above | (inherited) |
Note that diffuse response is calculated for 'source' and 'clean' event classes only.
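The EXTENDEDFT1 event selection in the table is a bitmask test on the event class word. A minimal Python sketch of the same test is shown below; the function and constant names are hypothetical, and no meaning is assigned to individual bits here because the page does not define them.

```python
# Mask taken from the EXTENDEDFT1 selection: ((FT1EventClass & 0x00003EFF) != 0)
EXTENDED_MASK = 0x00003EFF


def passes_extended_selection(event_class: int) -> bool:
    """Return True if any bit selected by EXTENDED_MASK is set in event_class."""
    return (event_class & EXTENDED_MASK) != 0


def mask_bit_positions(mask: int = EXTENDED_MASK) -> list:
    """List the bit positions (0 = least significant) that the mask selects."""
    return [bit for bit in range(32) if (mask >> bit) & 1]
```

The mask selects bits 0-7 and 9-13, so an event whose class word has only bit 8 set (or only bits above 13) does not pass the selection.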
Timing
- Each run requires approximately 20-30 minutes of CPU time, depending on the machine class being used. However, due to the AFS and dataCatalog issues, Block 1 processing was restricted to ~500 or fewer jobs at a time.
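As a rough guide, the figures above give an idealized floor on the wall-clock time for a block. The sketch below assumes perfect packing of jobs (no dispatch latency, no I/O stalls, no retries), so it underestimates the real elapsed time; the function name is hypothetical.

```python
def ideal_wall_clock_hours(n_runs: int,
                           cpu_minutes_per_run: float,
                           max_concurrent_jobs: int) -> float:
    """Idealized lower bound on wall-clock hours for a block of runs.

    Assumes one job per run and that max_concurrent_jobs jobs are
    always running until the work is exhausted.
    """
    if max_concurrent_jobs <= 0:
        raise ValueError("max_concurrent_jobs must be positive")
    total_cpu_minutes = n_runs * cpu_minutes_per_run
    return total_cpu_minutes / max_concurrent_jobs / 60.0
```

For the 29158 runs of Block 1, 20 CPU-minutes per run with 400 concurrent jobs gives about 24 hours, and 30 CPU-minutes per run with 300 concurrent jobs gives about 49 hours. The chronology reports a week-long effort, so the ideal figure should be read only as a floor.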