status: Under Construction
last update: 15 February 2012
This page is the beginning of a description of a full reprocessing based on Pass 7 analysis.
A 'full' Pass 7 reprocessing is currently envisioned to begin ~August 2011. This task will read DIGI files and emit RECON, MERIT, a selection of other ROOT files, and an array of FITS files. It will be CPU- and storage-intensive, requiring months of elapsed time and about 3/4 PB of storage. When the task begins, there will be about 16,000 science runs in Fermi (3 years of accumulation).
In order to avoid occupying a new 750 TB of disk space, the plan is to remove old RECON files once they have been reprocessed. This is a shell game that involves some amount of buffer space and then waiting until the new RECON file has been created and (to some extent) validated before removal. Further, the old RECON file must first be backed up to HPSS. A straw-man proposal for this scheme follows.
Assume 50 TB of xroot buffer space.
Reprocess ~10% of the mission data. Once complete, prepare a list of files to be removed (RECON, plus possibly others such as CAL and SVAC if they are also regenerated by the reprocessing task) and hand it off to Wilko. He will ensure that the files to be deleted have been backed up and then delete the disk copies.
This page is a record of the configuration for the P202 reprocessing project, full reprocessing from DIGIs using Pass7 analysis code. This project involves reprocessing with Pass7 classification trees and new IRFs. The name "P202" derives from the word "processing" and the initial file version to be used for the output data products, e.g., r0123456789_v202_merit.root.
"New style" tasks (using SCons, OO and common scripts, etc.)
- P202-ROOT - This task reads DIGI and produces reprocessed RECON+CAL+GCR+MERIT+FILTEREDMERIT (photons)+ELECTRONMERIT
- P202-FITS - This task reads MERIT and produces FT1 (photons) + EXTENDEDFT1 + LS1 (merit-like FITS file for photons) + EXTENDEDLS1 + ELECTRONFITS file
Datafile names, versions and locations
Data file version numbers for this reprocessing will begin with v202.
XROOT location and file naming
Location template:
/glast/Data/Flight/Reprocess/<reprocessName>/<dataType>
Locations for P202:
/glast/Data/Flight/Reprocess/P202/recon
/glast/Data/Flight/Reprocess/P202/cal
/glast/Data/Flight/Reprocess/P202/gcr
/glast/Data/Flight/Reprocess/P202/merit
/glast/Data/Flight/Reprocess/P202/filteredmerit
/glast/Data/Flight/Reprocess/P202/electronmerit
/glast/Data/Flight/Reprocess/P202/ft1
/glast/Data/Flight/Reprocess/P202/extendedft1
/glast/Data/Flight/Reprocess/P202/electronft1
/glast/Data/Flight/Reprocess/P202/ls1
/glast/Data/Flight/Reprocess/P202/extendedls1
File naming:
Data Type | aka | Send to FSSC | Naming template |
---|---|---|---|
RECON | | No | r<run#>_<version>_<dataType>.root |
CAL | | No | r<run#>_<version>_<dataType>.root |
GCR | | No | r<run#>_<version>_<dataType>.root |
MERIT | | No | r<run#>_<version>_<dataType>.root |
FILTEREDMERIT | | No | r<run#>_<version>_<dataType>.root |
ELECTRONMERIT | | No | r<run#>_<version>_<dataType>.root |
ELECTRONFT1 | | No | gll_el_p<procVer>_r<run#>_<version>.fit |
EXTENDEDFT1 | | No | gll_xp_p<procVer>_r<run#>_<version>.fit |
FT1 | LS-002 | Yes | gll_ph_p<procVer>_r<run#>_<version>.fit |
EXTENDEDLS1 | | No | gll_xe_p<procVer>_r<run#>_<version>.fit |
LS1 | LS-001 | Yes | gll_ev_p<procVer>_r<run#>_<version>.fit |
Note: 'procVer' is a field in the file name (also recorded as the keyword "PROC_VER" in the primary header) that was added to the FFD on 5/12/2010. Ref: http://fermi.gsfc.nasa.gov/ssc/dev/current_documents/Science_DP_FFD_RevA.pdf
Examples:
/glast/Data/Flight/Reprocess/P202/recon/r0239557414_v202_recon.root
/glast/Data/Flight/Reprocess/P202/cal/r0239557414_v202_cal.root
/glast/Data/Flight/Reprocess/P202/gcr/r0239557414_v202_gcr.root
/glast/Data/Flight/Reprocess/P202/merit/r0239557414_v202_merit.root
/glast/Data/Flight/Reprocess/P202/filteredmerit/r0239557414_v202_filteredmerit.root
/glast/Data/Flight/Reprocess/P202/electronmerit/r0239557414_v202_electronmerit.root
/glast/Data/Flight/Reprocess/P202/extendedft1/gll_xp_p202_r0239559565_v202.fit
/glast/Data/Flight/Reprocess/P202/ft1/gll_ph_p202_r0239559565_v202.fit
/glast/Data/Flight/Reprocess/P202/electronft1/gll_el_p202_r0239559565_v202.fit
/glast/Data/Flight/Reprocess/P202/extendedls1/gll_xe_p202_r0239559565_v202.fit
/glast/Data/Flight/Reprocess/P202/ls1/gll_ev_p202_r0239559565_v202.fit
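The location and naming templates above can be sketched as a small helper. This is only an illustration, not the code used by the task: the function names, the defaults, and the assumption that the run number is zero-padded to 10 digits and that procVer is 202 are all inferred from the examples on this page.

```python
# Illustrative sketch only: names and defaults are assumptions, not task code.
BASE = "/glast/Data/Flight/Reprocess"

# ROOT products share one template: r<run#>_<version>_<dataType>.root
ROOT_TYPES = ("recon", "cal", "gcr", "merit", "filteredmerit", "electronmerit")

# FITS products use a two-letter tag: gll_<tag>_p<procVer>_r<run#>_<version>.fit
FITS_TAGS = {
    "electronft1": "el",
    "extendedft1": "xp",
    "ft1": "ph",
    "extendedls1": "xe",
    "ls1": "ev",
}


def file_name(data_type, run, version="v202", proc_ver="202"):
    """Build a file name from the naming templates in the table above."""
    dt = data_type.lower()
    run_field = f"r{int(run):010d}"  # assumed: run number zero-padded to 10 digits
    if dt in ROOT_TYPES:
        return f"{run_field}_{version}_{dt}.root"
    if dt in FITS_TAGS:
        return f"gll_{FITS_TAGS[dt]}_p{proc_ver}_{run_field}_{version}.fit"
    raise ValueError(f"unknown data type: {data_type!r}")


def xroot_path(data_type, run, reprocess_name="P202", **kwargs):
    """Combine the location template with the file name."""
    name = file_name(data_type, run, **kwargs)
    return f"{BASE}/{reprocess_name}/{data_type.lower()}/{name}"


print(xroot_path("merit", 239557414))
print(xroot_path("ft1", 239559565))
```

Passing a different reprocess_name, version or proc_ver would adapt the same sketch to another reprocessing.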
DataCatalog location and naming
Logical directory and group template:
Data/Flight/Reprocess/<reprocessName>:<dataType>
Note that the <dataType> field (following the colon) is a DataCatalog 'group' name, and file names are of the form r<run#>.
Naming examples:
Data/Flight/Reprocess/P202:RECON r0239557414
Data/Flight/Reprocess/P202:CAL r0239557414
Data/Flight/Reprocess/P202:GCR r0239557414
Data/Flight/Reprocess/P202:MERIT r0239557414
Data/Flight/Reprocess/P202:FILTEREDMERIT r0239557414
Data/Flight/Reprocess/P202:EXTENDEDFT1 r0239557414
Data/Flight/Reprocess/P202:FT1 r0239557414
Data/Flight/Reprocess/P202:ELECTRONFT1 r0239557414
Data/Flight/Reprocess/P202:EXTENDEDLS1 r0239557414
Data/Flight/Reprocess/P202:LS1 r0239557414
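As a rough illustration of the DataCatalog convention, the following sketch builds the logical folder, group and file name for a run. The function name is hypothetical, and the 10-digit zero padding is an assumption taken from the examples above; this does not call any DataCatalog API.

```python
# Illustrative sketch only: datacatalog_entry is a hypothetical helper.
def datacatalog_entry(data_type, run, reprocess_name="P202"):
    """Return (folder:group, file name) following the template above."""
    folder = f"Data/Flight/Reprocess/{reprocess_name}"
    group = data_type.upper()           # the group name follows the colon
    name = f"r{int(run):010d}"          # assumed: run number padded to 10 digits
    return f"{folder}:{group}", name


location, name = datacatalog_entry("recon", 239557414)
print(location, name)
```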
Data Sample
The currently defined data sample for P202 reprocessing includes:
Quantity | Value | Notes |
---|---|---|
First run | 239557414 (MET), 2008-08-04 15:43:34 (UTC) | |
Last run | 348951073 (MET), 2012-01-22 18:51:13 (UTC) | |
Total runs | 19,172 | |
Total input DIGI events | 41,856,513,685 | |
Total RECON events | | |
Total CAL events | | |
Total GCR events | | |
Total MERIT events | | all "events" |
Total EXTENDEDFT1/LS1 events | | all photon event classes |
Total LS1 (FSSC selection) events | | event classes (bits) 0,2,3,4 (transient, source, clean, ultraclean) |
Total FT1 (FSSC selection) events | | event classes (bits) 2,3,4 (source, clean, ultraclean) |
Total disk space used | N/A | |
NOTE: One run, 242429468, of type TrigTest was declared 'good for science' but long after this task got started, so it has been intentionally omitted.
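Run numbers are the run start time in Mission Elapsed Time (MET), counted in seconds from 2001-01-01 00:00:00 UTC. The sketch below shows a simple conversion that reproduces the UTC values quoted for the first and last runs. It deliberately ignores leap seconds, which is an assumption that matches the table but can differ from a rigorous conversion by a second or two; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fermi MET reference epoch: 2001-01-01 00:00:00 UTC
MET_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def met_to_utc(met_seconds):
    """Convert MET seconds to a UTC datetime, ignoring leap seconds (assumption)."""
    return MET_EPOCH + timedelta(seconds=met_seconds)


for run in (239557414, 348951073):
    print(run, met_to_utc(run).strftime("%Y-%m-%d %H:%M:%S"))
```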
[to be continued...]
Bookkeeping
- (This page): Define ingredients of reprocessing (processing code/configuration changes)
- Processing History database: http://glast-ground.slac.stanford.edu/HistoryProcessing/HProcessingRuns.jsp?processingname=P202
  - List of all reprocessings
  - List of all data runs reprocessed
  - Pointers to all input data files (-> dataCatalog)
  - Pointers to associated task processes (-> Pipeline II status)
- Data Catalog database: http://glast-ground.slac.stanford.edu/DataCatalog/folder.jsp
  - Lists of and pointers to all output data files
  - Meta data associated with each output data product
P202-ROOT
Status chronology
- 2/13/2012 - begin trials with final calibration and alignments from Leon; 5 runs reprocessed
- 2/14/2012 - trials continue with blocks of 15, 20 and 25 runs reprocessed (each run generates ~20 batch jobs)
Configuration
Parameter | Value |
---|---|
Task Location | /nfs/farm/g/glast/u38/Reprocess-tasks/P202-ROOT |
Task Status | http://glast-ground.slac.stanford.edu/Pipeline-II/index.jsp |
GlastRelease | 17-35-24-gr17 (SCons RHEL4-32 build) |
Run Selection | based on "standard" selection Public Release 4 (https://confluence.slac.stanford.edu/display/SCIGRPS/Official+LAT+Datasets) |
s/c data | "standard" Public Release 2 (https://confluence.slac.stanford.edu/display/SCIGRPS/Official+LAT+Datasets) |
Input Run List | ftp://ftp-glast.slac.stanford.edu/glast.u38/Reprocess-tasks/P202-ROOT/config/runList.txt |
photonFilter | CTBParticleType==0 && CTBClassLevel>0 |
electronFilter | CTBParticleType==1 |
Code Variants | redhat4-i686-32bit-gcc34 (Optimized) |
jobOpts | ftp://ftp-glast.slac.stanford.edu/glast.u38/Reprocess-tasks/P202-ROOT/config/doRecon.txt |
Output Data Products | |
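To make the photonFilter and electronFilter cuts concrete, here is a minimal sketch that applies the same conditions to events held as plain Python dictionaries. In the real task the cuts are applied to MERIT tuples by the reprocessing code; the dictionary representation, the function names and the sample values here are assumptions for illustration only.

```python
# Illustrative sketch only: events are plain dicts, not MERIT tuple rows.
def photon_filter(evt):
    """photonFilter: CTBParticleType==0 && CTBClassLevel>0"""
    return evt["CTBParticleType"] == 0 and evt["CTBClassLevel"] > 0


def electron_filter(evt):
    """electronFilter: CTBParticleType==1"""
    return evt["CTBParticleType"] == 1


# Made-up sample events, one per interesting case
events = [
    {"CTBParticleType": 0, "CTBClassLevel": 2},  # photon candidate
    {"CTBParticleType": 0, "CTBClassLevel": 0},  # fails the class-level cut
    {"CTBParticleType": 1, "CTBClassLevel": 0},  # electron candidate
]

filtered_merit = [e for e in events if photon_filter(e)]
electron_merit = [e for e in events if electron_filter(e)]
print(len(filtered_merit), len(electron_merit))
```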
Timing and Scaling
- processClump
- mergeClumps
Load balancing
trickleStream parameters:
P202-FITS
This task generates all desired FITS data products.
Status chronology
Configuration
*** UNDER CONSTRUCTION, DO NOT DEPEND ON DATA!! ***
Parameter | Value |
---|---|
Task Location | /nfs/farm/g/glast/u38/Reprocess-tasks/P202-FITS |
Task Status | http://glast-ground.slac.stanford.edu/Pipeline-II/task.jsp?task=75031156 |
Input Data | MERIT (from P202-ROOT) |
spacecraft data | same as P202-ROOT |
Input Run List | ftp://ftp-glast.slac.stanford.edu/glast.u38/Reprocess-tasks/P202-FITS/config/runList.txt |
Reprocessing Mode | reFT1 |
meritFilter | FT1EventClass!=0 |
evtClassDefs | 00-19-01 |
eventClassMap | EvtClassDefs_P7V6.xml |
ScienceTools | 09-24-00 |
Code Variants | redhat5-i686-32bit-gcc41 (Optimized) |
Diffuse Model | based on contents of /afs/slac.stanford.edu/g/glast/ground/GLAST_EXT/diffuseModels/v2r0 |
Diffuse Response | 'source' using P7SOURCE_V6 IRF |
IRFs | P6V7, contained within ScienceTools release |
Output Data Products | |
Processing chain for FITS data products
Data Product | selection | makeFT1 | gtdiffrsp | gtmktime | gtltcube |
---|---|---|---|---|---|
FT1 | 'source' and above | true | true | true | false |
LS1 | 'transient' and above | true | true | true | false |
FT1EXTENDED | FT1EventClass!=0 | true | true | true | false |
LS1EXTENDED | FT1EventClass!=0 | true | true | true | false |
ELECTRONFT1 | CTBParticleType==1 | true | false | true | false |
Note that diffuse response is calculated for 'source' and 'clean' event classes only.
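The processing chain table can also be written as a small lookup structure, which makes it easy to see which steps run for each product. This is only a sketch of the table contents, not the task's actual configuration format; the names STEPS, CHAIN and enabled_steps are hypothetical.

```python
# Illustrative sketch only: mirrors the table, not the real task configuration.
STEPS = ("makeFT1", "gtdiffrsp", "gtmktime", "gtltcube")

# product -> (selection, flag for each step, in the order given by STEPS)
CHAIN = {
    "FT1":         ("'source' and above",    (True, True, True, False)),
    "LS1":         ("'transient' and above", (True, True, True, False)),
    "FT1EXTENDED": ("FT1EventClass!=0",      (True, True, True, False)),
    "LS1EXTENDED": ("FT1EventClass!=0",      (True, True, True, False)),
    "ELECTRONFT1": ("CTBParticleType==1",    (True, False, True, False)),
}


def enabled_steps(product):
    """Return the steps marked 'true' for a data product, in chain order."""
    _selection, flags = CHAIN[product]
    return [step for step, on in zip(STEPS, flags) if on]


for product in CHAIN:
    print(product, enabled_steps(product))
```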
Note on 'Code Variant': The SLAC batch farm contains a mixture of architectures, both hardware (Intel/AMD 64-bit) and software (RHEL5-64, gcc v4.1, etc.). At this time, GlastRelease builds only on RHEL4-32, while ScienceTools builds for RHEL5-32 and RHEL5-64.