
status: In Progress
last update: 21 Feb 2014

*** PAGE IN PROGRESS: Not yet ready for public consumption ***

"New generation" tasks (using SCons builds, rewritten task scripts, common python scripts, etc.)
  • P203-FITS - This task reads MERIT and produces FT1 (photons) + EXTENDEDFT1


This task is identical to P202-FITS with the following exceptions:

  1. The dataset reprocessed is extended to include all current (Level 1) data since the end of the P202 task.
  2. New diffuse response
  3. Only FT1 and EXTENDEDFT1 data products are produced
  4. File naming and other bookkeeping uses "P203" rather than "P202", where appropriate


Bookkeeping

  1. (This page): Define ingredients of reprocessing (processing code/configuration changes)
  2. Processing History database: http://glast-ground.slac.stanford.edu/HistoryProcessing/HProcessingRuns.jsp?processingname=P203
    1. List of all reprocessings
    2. List of all data runs reprocessed
    3. Pointers to all input data files (-> dataCatalog)
    4. Pointers to associated task processes (-> Pipeline II status)
  3. Data Catalog database: http://glast-ground.slac.stanford.edu/DataCatalog/folder.jsp
    1. Lists of and pointers to all output data files
    2. Meta data associated with each output data product

P203-FITS

This task generates FT1 and EXTENDEDFT1 data products.

Status chronology

  • 2/20/2014 - Initial set-up of task
  • 2/25/2014 - Begin trickleStream on Block 1 – entire P202 run list
  • 2/28/2014 - Many problems: LSF seems to bunch up a large number of jobs before dispatching them.  This sends a shock wave of jobs that stresses AFS (each job reads the ~500 MB diffuse response file) and the dataCatalog (the queries for the FT2 file and for the next file version of each data product).  Therefore, trickleStream was reconfigured to a mere dribble – no more than 300-400 jobs typically running simultaneously.  As a result, a 1-day task has become a week-long effort.
  • 3/3/2014 - Block 1 complete.  Last week we developed a new scheme to move all dataCatalog queries from the batch jobs into jython scriptlets; these changes will be integrated before any more data are processed.  Validity check: the reprocessed P203 data contain the same number of runs and the same number of events as the P202 data:
    310,326,817 events in 29158 files.  The run range for block 1 (== entire range of P202) is 239557414 through 405329691.
  • 3/4/2014 - Configure Block 2, consisting of Level 1 data since 5 Nov 2013 through 28 Feb 2014.  Begin trickleStream.

    #runs   30915
    #evts   67249594218
    start   239557417    2008-08-04 15:43:37
    stop    415328595    2014-03-01 01:03:15
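
    The start and stop values above are run IDs in Fermi Mission Elapsed Time (MET, seconds counted from 2001-01-01 00:00:00 UTC). A minimal sketch of the conversion, ignoring leap seconds, which reproduces the timestamps listed for this block:

    ```python
    from datetime import datetime, timedelta

    # Fermi Mission Elapsed Time is counted from this epoch.
    MET_EPOCH = datetime(2001, 1, 1, 0, 0, 0)

    def met_to_utc(met_seconds):
        """Convert a MET value (seconds) to a naive UTC datetime, ignoring leap seconds."""
        return MET_EPOCH + timedelta(seconds=met_seconds)

    print(met_to_utc(239557417))  # 2008-08-04 15:43:37
    print(met_to_utc(415328595))  # 2014-03-01 01:03:15
    ```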


Configuration

Identical to P202-FITS except for the following:

Diffuse Model

based on contents of /afs/slac.stanford.edu/g/glast/ground/GLAST_EXT/diffuseModels/v4r1
(see https://confluence.slac.stanford.edu/display/SCIGRPS/Quick+Start+with+Pass+7)

Output Data Products

FT1, EXTENDEDFT1

Generation of output data products:

| Data Product | destination | data content [1] | event selection [1]                                        | makeFT1 | gtselect | gtdiffrsp   | gtmktime |
|--------------|-------------|------------------|------------------------------------------------------------|---------|----------|-------------|----------|
| EXTENDEDFT1  | SLAC        | FT1variables     | ((FT1EventClass & 0x00003EFF)!=0) pass7.6_Extended_cuts_L1 | (tick)  | (error)  | (tick)      | (tick)   |
| FT1          | FSSC+SLAC   | FT1variables     | 'source' and above (EVENT_CLASS bits 2,3,4); evclass=2 filtered from EXTENDEDFT1 | (error) | (tick) | (inherited) | (tick) |
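The EXTENDEDFT1 selection is a bit mask on FT1EventClass, and FT1 is the subset whose 'source'-and-above class bits are set. A minimal sketch of both cuts in Python (the event records below are hypothetical; only the two masks come from this page):

```python
# Hypothetical event records; only the bit masks below come from the table above.
events = [
    {"run": 1, "FT1EventClass": 0x0004},  # 'source' bit (bit 2) set
    {"run": 1, "FT1EventClass": 0x0001},  # in EXTENDEDFT1 mask, below 'source'
    {"run": 1, "FT1EventClass": 0x4000},  # outside the EXTENDEDFT1 mask
]

EXTENDED_MASK = 0x00003EFF                     # EXTENDEDFT1: ((FT1EventClass & 0x00003EFF) != 0)
SOURCE_MASK = (1 << 2) | (1 << 3) | (1 << 4)   # FT1: EVENT_CLASS bits 2, 3, 4 ('source' and above)

extended_ft1 = [e for e in events if e["FT1EventClass"] & EXTENDED_MASK]
ft1 = [e for e in extended_ft1 if e["FT1EventClass"] & SOURCE_MASK]

print(len(extended_ft1))  # 2
print(len(ft1))           # 1
```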

Note that diffuse response is calculated for 'source' and 'clean' event classes only.

Timing

  • Each run requires approximately 20-30 minutes of CPU time, depending on the machine class used.  However, due to the AFS and dataCatalog issues, block 1 running was restricted to ~500 or fewer simultaneous jobs.

DataCatalog query change (2/28/2014)

1) Refer to the modified files already in the DCtest task on /u38, which was used to prototype this change.

2) Update repTools.py with the new version of getCurrentVersion() and make a completely
new release, 00-01-05.  Note that findFt2() is now obsolete.

3) In the P203-FITS/config directory, make these changes:

  • config.py - change the pointer to the new version of commonTools
  • setupRun.py - prepare the list of output data product types
  • createClumps.jy - query for the FT2 file name and store it in a pipeline variable
  • processClump.py - fetch the FT2 file name directly from the pipeline variable rather than via a query
  • setupMerge.jy - query for the latest file version of each output data type
  • mergeClumps.py - convert pipeline variables into environment variables

4) The usual git commit/tag/push
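
The repTools.py internals are not shown on this page, so the following is only a hedged sketch of what a getCurrentVersion()-style helper has to do once the version lookups are batched into setupMerge.jy: given the file versions already registered in the dataCatalog for one output data product, return the next version number. All names and the signature here are assumptions:

```python
def next_file_version(existing_versions):
    """Return the next integer file version given those already registered
    in the dataCatalog for one output data product.

    Hypothetical helper: the real getCurrentVersion() in repTools.py
    is not reproduced on this page.
    """
    if not existing_versions:
        return 1
    return max(existing_versions) + 1

# One batched lookup per data product type, instead of one dataCatalog
# query per batch job (the registered versions here are made up):
registered = {"FT1": [1, 2], "EXTENDEDFT1": []}
versions = {product: next_file_version(v) for product, v in registered.items()}
print(versions)  # {'FT1': 3, 'EXTENDEDFT1': 1}
```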
