P116 Reprocessing
status: In Progress
last update: 11 November 2010
This page records the configuration of the P116 reprocessing project, which rebuilds FT1 files (and friends) for all science data with new, extended event classifications.
- P116-FT1 - this task reads existing (Level 1) MERIT files and produces FT1 (photons) + LS1 ("fat" FT1) files
Datafile names, versions and locations
Data file version numbers for this reprocessing will begin with v116.
XROOT location and file naming
Location template:
/glast/Data/Flight/Reprocess/<reprocessName>/<dataType>
Locations for P116:
/glast/Data/Flight/Reprocess/P116/ft1
/glast/Data/Flight/Reprocess/P116/electronft1
/glast/Data/Flight/Reprocess/P116/ls1
File naming:
Data Type | aka | Send to FSSC | Naming template |
---|---|---|---|
FT1 | LS-002 | Yes | gll_ph_p<procVer>_r<run#>_<version>.fit |
LS1 | LS-001 | Yes | gll_ev_p<procVer>_r<run#>_<version>.fit |
Note: the 'procVer' field in the file name, along with the keyword "PROC_VER" in the primary header, was added to the FFD on 5/12/2010. Ref: http://fermi.gsfc.nasa.gov/ssc/dev/current_documents/Science_DP_ICD_RevA.pdf
Example:
/glast/Data/Flight/Reprocess/P116/ft1/gll_ph_p116_r0239559565_v116.fit
/glast/Data/Flight/Reprocess/P116/ls1/gll_ev_p116_r0239559565_v116.fit
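As a sketch of how the location and naming templates expand, the following Python helper assembles the full XROOT path for a run. The helper name `xroot_path` and its arguments are hypothetical, and the 10-digit zero padding of the run number is inferred from the example above rather than stated in the template.

```python
# Sketch only: helper and argument names are illustrative, not part of the task code.
XROOT_BASE = "/glast/Data/Flight/Reprocess"

# File-name prefix per data type, from the file naming table (FT1 -> gll_ph, LS1 -> gll_ev).
PREFIX = {"ft1": "gll_ph", "ls1": "gll_ev"}


def xroot_path(reprocess_name, data_type, proc_ver, run, version):
    """Build <base>/<reprocessName>/<dataType>/<prefix>_p<procVer>_r<run#>_<version>.fit."""
    # Assumption: run numbers are zero-padded to 10 digits, as in the example.
    file_name = f"{PREFIX[data_type]}_p{proc_ver}_r{run:010d}_v{version}.fit"
    return f"{XROOT_BASE}/{reprocess_name}/{data_type}/{file_name}"


print(xroot_path("P116", "ft1", 116, 239559565, 116))
print(xroot_path("P116", "ls1", 116, 239559565, 116))
```

With these inputs the helper reproduces the two example paths listed above.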
DataCatalog location and naming
Logical directory and group template:
Data/Flight/Reprocess/<reprocessName>:<dataType>
Note that the <dataType> field (following the colon) is a DataCatalog 'group' name, and file names are of the form r<run#>.
Naming examples:
Data/Flight/Reprocess/P116:FT1 r0239557414
Data/Flight/Reprocess/P116:LS1 r0239557414
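The logical naming can be sketched the same way. The function name `catalog_entry` and its return shape are hypothetical; as before, the 10-digit zero padding of the run number is an assumption based on the examples.

```python
# Sketch only: the function name and return shape are illustrative.
def catalog_entry(reprocess_name, group, run):
    """Return (logical folder with group, file name) for a DataCatalog entry."""
    folder_and_group = f"Data/Flight/Reprocess/{reprocess_name}:{group}"
    # Assumption: run numbers are zero-padded to 10 digits, as in the examples.
    return folder_and_group, f"r{run:010d}"


for group in ("FT1", "LS1"):
    print(*catalog_entry("P116", group, 239557414))
```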
Data Sample
The currently defined data sample for P116 reprocessing includes the following runs.
Block | First run | First run date | Last run | Last run date | #Runs | #Merit Evts | DataCatalog Source | Note |
---|---|---|---|---|---|---|---|---|
1 | 239557414 | 2008-08-04 15:43:34 | 242047683 | 2008-09-02 11:28:02 | 431 | 909,050,672 | /Data/Test/Flight/Repro/ReproTest7 | reprocessed for alignment fix |
1 | 242053458 | 2008-09-02 13:04:17 | 307579060 | 2010-09-30 22:37:38 | 11413 | 24,906,606,023 | /Data/Flight/Level1/LPA/ | Standard Level 1 output (ignore StdIntent) |
1 Subtotal | | | | | 11844 | 25,815,656,695 | | |
2 | 307585048 | 2010-10-01 00:17:26 | 311108362 | 2010-11-10 18:59:20 | 617 | 1,368,042,493 | /Data/Flight/Level1/LPA/ | Standard Level 1 output |
Grand Total | | | | | 12461 | 27,183,699,188 | | |
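The subtotal and grand total rows can be cross-checked from the per-row run and event counts. The short script below is only a bookkeeping sketch; the variable and function names are illustrative.

```python
# (runs, MERIT events) for each row of the data sample table.
block1 = [(431, 909_050_672), (11_413, 24_906_606_023)]
block2 = [(617, 1_368_042_493)]


def totals(rows):
    """Sum the run counts and MERIT event counts over a list of rows."""
    return sum(runs for runs, _ in rows), sum(events for _, events in rows)


sub_runs, sub_events = totals(block1)
all_runs, all_events = totals(block1 + block2)
print(f"Block 1 subtotal: {sub_runs} runs, {sub_events:,} events")
print(f"Grand total:      {all_runs} runs, {all_events:,} events")
```

The sums agree with the table: 11844 runs and 25,815,656,695 events for block 1, and 12461 runs and 27,183,699,188 events overall.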
A backfill operation is expected to close the gap with Level 1 once Level 1 begins to produce these new event classifications. Ultimately, these data are destined for the FSSC.
Bookkeeping
- (This page): Define ingredients of reprocessing (processing code/configuration changes)
- Processing History database: http://glast-ground.slac.stanford.edu/HistoryProcessing/HProcessingRuns.jsp?processingname=P116
  - List of all reprocessings
  - List of all data runs reprocessed
  - Pointers to all input data files (-> dataCatalog)
  - Pointers to associated task processes (-> Pipeline II status)
- Data Catalog database: http://glast-ground.slac.stanford.edu/DataCatalog/folder.jsp
  - Lists of and pointers to all output data files
  - Meta data associated with each output data product
P116-FT1
This task is roughly equivalent to the P105-FT1 task but with modernization.
Status chronology
- 11/11/2010 - gear up for reprocessing data block2
- 10/18/2010 - 146 mergeClumps jobs ran out of CPU time. Changed to xxl batch queue and rolled back.
- 10/14/2010 - Validation complete, begin production for data block1
- 10/1/2010 - Begin task construction
Configuration
Task Location | /nfs/farm/g/glast/u38/Reprocess-tasks/P116-FT1 |
Task Status | http://glast-ground.slac.stanford.edu/Pipeline-II/index.jsp |
Input Data Selection | MERIT |
Input Run List | ftp://ftp-glast.slac.stanford.edu/glast.u38/Reprocess-tasks/P116-FT1/config/runFile.txt |
Reprocessing Mode | reFT1 |
evtClassDefs | 00-18-00 |
photon cut | pass6_FSW_cuts |
eventClassifier | dataclean_classifier.py |
eventClassMap | n/a |
s/c data | FT2 from P105 (runs 239557414 - 271844560), then from current Level 1 production |
ScienceTools | 09-18-05 (SCons build) |
Allowed Code Variants | redhat4-i686-32bit-gcc34, redhat4-x86_64-64bit-gcc34, |
Diffuse Model | /afs/slac.stanford.edu/g/glast/ground/releases/analysisFiles/diffuse/v2/source_model_v02.xml |
Diffuse Response IRFs | P6_V3_DIFFUSE, P6_V3_DATACLEAN |
Output Data Products |
Processing chain for FITS data products
Data Product | makeFT1 | gtdiffrsp | gtmktime | gtltcube |
---|---|---|---|---|
FT1 | true | true for | true | false |
LS1 | true | false | true | false |
Note on 'Code Variant': The SLAC batch farm contains a mixture of architectures, both hardware (Intel/AMD 64-bit) and software (RedHat Enterprise Linux 4 and 5; gcc 3.4, 4.1, etc.). Currently, all batch machines have 64-bit hardware and most, if not all, run 64-bit operating systems.
Note on diffuse response calculation: gtdiffrsp is called twice in succession, first with IRF P6_V3_DIFFUSE and evclass==3, then with IRF P6_V3_DATACLEAN and evclass==4. The resulting FT1 file has four columns of diffuse response: two columns (galactic and extragalactic response) for each of the two IRFs.
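To make the column count concrete, the two passes can be enumerated as plain data. This sketch does not invoke gtdiffrsp, and the labels it prints are descriptive only; they are not the actual FITS column names, which are not given on this page.

```python
# The two gtdiffrsp passes described in the note: (IRF, event class selection).
PASSES = [("P6_V3_DIFFUSE", 3), ("P6_V3_DATACLEAN", 4)]
# Each pass adds one response column per diffuse component.
COMPONENTS = ("galactic", "extragalactic")

# Labels are descriptive only; they are not the actual FITS column names.
columns = [(irf, evclass, component)
           for irf, evclass in PASSES
           for component in COMPONENTS]

for irf, evclass, component in columns:
    print(f"{irf} (evclass=={evclass}): {component} response")
```

Two IRFs times two diffuse components gives the four diffuse response columns.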
Timing
- The most time-consuming step in P116-FT1 is the 'mergeClump' step, wherein the diffuse response is calculated (twice). The CPU time required for this step, after ~2300 jobs have completed, appears in the following plot:
Thus, depending on the run and the speed of the batch machine (bimodal distribution for fell vs hequ), jobs finish after anywhere from 2 to 11 hours, with the most probable time being about 5 hours.
With 1000 cores, one might then estimate 60 hours to complete the entire project.
Near final timing
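The 60-hour figure is consistent with a simple back-of-the-envelope calculation. The sketch below assumes one mergeClump job per run in data block 1 (11844 runs) and takes the most probable CPU time of 5 hours as the per-job cost. Both are assumptions made for illustration, not figures confirmed by the task configuration, and the function name is hypothetical.

```python
def wall_clock_hours(n_jobs, cpu_hours_per_job, n_cores):
    """Idealized estimate: total CPU time spread evenly over all cores."""
    return n_jobs * cpu_hours_per_job / n_cores


# Assumptions: one job per block 1 run, 5 CPU hours each, 1000 cores kept fully busy.
estimate = wall_clock_hours(n_jobs=11_844, cpu_hours_per_job=5.0, n_cores=1000)
print(f"{estimate:.1f} hours")
```

Under these assumptions the estimate comes to about 59 hours. It ignores queue wait time, the spread between 2-hour and 11-hour jobs, and all other pipeline steps, so it is a rough planning figure at best.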