
status: In Progress
last update: 21 Feb 2014

*** PAGE IN PROGRESS: Not yet ready for public consumption ***

"New generation" tasks (using SCons builds, rewritten task scripts, common python scripts, etc.)
  • P203-FITS - This task reads MERIT and produces FT1 (photons) + EXTENDEDFT1


This task is identical to P202-FITS with the following exceptions:

  1. The dataset reprocessed is extended to include all current (Level 1) data since the end of the P202 task.
  2. New diffuse response
  3. Only FT1 and EXTENDEDFT1 data products are produced
  4. File naming and other bookkeeping uses "P203" rather than "P202", where appropriate


Bookkeeping

  1. (This page): Define ingredients of reprocessing (processing code/configuration changes)
  2. Processing History database: http://glast-ground.slac.stanford.edu/HistoryProcessing/HProcessingRuns.jsp?processingname=P203
    1. List of all reprocessings
    2. List of all data runs reprocessed
    3. Pointers to all input data files (-> dataCatalog)
    4. Pointers to associated task processes (-> Pipeline II status)
  3. Data Catalog database: http://glast-ground.slac.stanford.edu/DataCatalog/folder.jsp
    1. Lists of and pointers to all output data files
    2. Meta data associated with each output data product

P203-FITS

This task generates FT1 and EXTENDEDFT1 data products.

Status chronology

  • 2/20/2014 - Initial set-up of task
  • 2/25/2014 - Begin trickleStream on Block 1 – entire P202 run list
  • 2/28/2014 - Many problems: LSF seems to bunch up a large number of jobs before dispatching them.  This sends a shock wave of jobs that stresses AFS (each job reads the ~500 MB diffuse response file) and the dataCatalog (the queries for the FT2 file and for the next file version of each data product).  Therefore, trickleStream was reconfigured to a mere dribble – no more than 300-400 jobs typically running simultaneously.  As a result, a 1-day task has become a week-long effort.
  • 3/3/2014 - Block 1 complete.  Last week we developed a new scheme to move all dataCatalog queries from the batch jobs into jython scriptlets; these changes will be integrated before any more data are processed.  Validity check: the reprocessed P203 data contain the same number of runs and the same number of events as the P202 data:
    310,326,817 events in 29158 files.  The run range for block 1 (== entire range of P202) is 239557414 through 405329691.
  • 3/4/2014 - Configure Block 2, consisting of Level 1 data since 5 Nov 2013 through 28 Feb 2014.  Begin trickleStream.

    #runs   30915
    #evts   67249594218
    start   239557417    2008-08-04 15:43:37
    stop    415328595    2014-03-01 01:03:15
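
    The start and stop values above are run IDs in Fermi Mission Elapsed Time (MET, seconds counted from 2001-01-01 00:00:00 UTC). A minimal sketch of the conversion, ignoring leap seconds, which reproduces the timestamps listed for this block:

    ```python
    from datetime import datetime, timedelta

    # Fermi Mission Elapsed Time is counted from this epoch.
    MET_EPOCH = datetime(2001, 1, 1, 0, 0, 0)

    def met_to_utc(met_seconds):
        """Convert a MET value (seconds) to a naive UTC datetime, ignoring leap seconds."""
        return MET_EPOCH + timedelta(seconds=met_seconds)

    print(met_to_utc(239557417))  # 2008-08-04 15:43:37
    print(met_to_utc(415328595))  # 2014-03-01 01:03:15
    ```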


Configuration

Identical to P202-FITS except for the following:

Diffuse Model

based on contents of /afs/slac.stanford.edu/g/glast/ground/GLAST_EXT/diffuseModels/v4r1
(see https://confluence.slac.stanford.edu/display/SCIGRPS/Quick+Start+with+Pass+7)

Output Data Products

FT1, EXTENDEDFT1

Generation of output data products:

| Data Product | destination | data content [1] | event selection [1]                                        | makeFT1 | gtselect | gtdiffrsp   | gtmktime |
|--------------|-------------|------------------|------------------------------------------------------------|---------|----------|-------------|----------|
| EXTENDEDFT1  | SLAC        | FT1variables     | ((FT1EventClass & 0x00003EFF)!=0) pass7.6_Extended_cuts_L1 | (tick)  | (error)  | (tick)      | (tick)   |
| FT1          | FSSC+SLAC   | FT1variables     | 'source' and above (EVENT_CLASS bits 2,3,4); evclass=2 filtered from EXTENDEDFT1 | (error) | (tick) | (inherited) | (tick) |
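The EXTENDEDFT1 selection is a bit mask on FT1EventClass, and FT1 is the subset whose 'source'-and-above class bits are set. A minimal sketch of both cuts in Python (the event records below are hypothetical; only the two masks come from this page):

```python
# Hypothetical event records; only the bit masks below come from the table above.
events = [
    {"run": 1, "FT1EventClass": 0x0004},  # 'source' bit (bit 2) set
    {"run": 1, "FT1EventClass": 0x0001},  # in EXTENDEDFT1 mask, below 'source'
    {"run": 1, "FT1EventClass": 0x4000},  # outside the EXTENDEDFT1 mask
]

EXTENDED_MASK = 0x00003EFF                     # EXTENDEDFT1: ((FT1EventClass & 0x00003EFF) != 0)
SOURCE_MASK = (1 << 2) | (1 << 3) | (1 << 4)   # FT1: EVENT_CLASS bits 2, 3, 4 ('source' and above)

extended_ft1 = [e for e in events if e["FT1EventClass"] & EXTENDED_MASK]
ft1 = [e for e in extended_ft1 if e["FT1EventClass"] & SOURCE_MASK]

print(len(extended_ft1))  # 2
print(len(ft1))           # 1
```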

Note that diffuse response is calculated for 'source' and 'clean' event classes only.

Timing

  • Each run requires approximately 20-30 minutes of CPU time, depending on the machine class used.  However, due to the AFS and dataCatalog issues, block 1 running was restricted to ~500 or fewer simultaneous jobs.

DataCatalog query change (2/28/2014)

1) Refer to the modified files already in the DCtest task on /u38, which was used to prototype this change.

2) Update repTools.py with the new version of getCurrentVersion() and make a completely
new release, 00-01-05.  Note that findFt2() is now obsolete.

3) In the P203-FITS/config directory, make these changes:

  • config.py - change the pointer to the new version of commonTools
  • setupRun.py - prepare the list of output data product types
  • createClumps.jy - query for the FT2 file name and store it in a pipeline variable
  • processClump.py - fetch the FT2 file name directly from the pipeline variable rather than via a query
  • setupMerge.jy - query for the latest file version of each output data type
  • mergeClumps.py - convert pipeline variables into environment variables

4) The usual git commit/tag/push
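
The repTools.py internals are not shown on this page, so the following is only a hedged sketch of what a getCurrentVersion()-style helper has to do once the version lookups are batched into setupMerge.jy: given the file versions already registered in the dataCatalog for one output data product, return the next version number. All names and the signature here are assumptions:

```python
def next_file_version(existing_versions):
    """Return the next integer file version given those already registered
    in the dataCatalog for one output data product.

    Hypothetical helper: the real getCurrentVersion() in repTools.py
    is not reproduced on this page.
    """
    if not existing_versions:
        return 1
    return max(existing_versions) + 1

# One batched lookup per data product type, instead of one dataCatalog
# query per batch job (the registered versions here are made up):
registered = {"FT1": [1, 2], "EXTENDEDFT1": []}
versions = {product: next_file_version(v) for product, v in registered.items()}
print(versions)  # {'FT1': 3, 'EXTENDEDFT1': 1}
```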
