...

First run

239557414 (MET), 2008-08-04 15:43:34 (UTC)

 

Last run

354923690 (MET), 2012-03-31 21:54:48 (UTC)

 

Total runs

20,229

 

Total input DIGI events

44,125,679,961

 

Total RECON events  

44,125,679,961

 

Total CAL events  

44,125,679,961

 

Total GCR events  

44,125,679,961

 

Total MERIT events  

44,125,679,961

all "events"

Total FILTEREDMERIT events

6,291,396,711

all photon event classes

Total ELECTRONMERIT events

90,904,582

all electron events

Generation of FITS files is a second step in the reprocessing and has only been run on the first year of data. Stay tuned...

Total EXTENDEDFT1/LS1 events

 

selected

 

all photon event classes

Total LS1 (FSSC selection) events

 

event classes (bits) 0,2,3,4 (transient, source, clean, ultraclean)

Total FT1 (FSSC selection) events

 

event classes (bits) 2,3,4 (source, clean, ultraclean)

Total disk space used

N/A

 

NOTE: One run, 242429468, of type TrigTest was declared 'good for science' and has been included.
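The MET (mission elapsed time) values quoted above count seconds since 2001-01-01 00:00:00 UTC. As a rough cross-check of a MET/UTC pair such as the one given for the last run, the conversion can be sketched as follows. The helper name `met_to_utc` and the hard-coded leap-second list are assumptions made for this illustration and are not part of the reprocessing code; behaviour exactly at a leap second is not treated carefully.

```python
from datetime import datetime, timedelta

# MET reference epoch: 2001-01-01 00:00:00 UTC.
MET_EPOCH = datetime(2001, 1, 1)

# UTC instants immediately after each leap second inserted since the epoch
# (hard-coded for illustration; extend if later leap seconds are announced).
LEAP_BOUNDARIES = [
    datetime(2006, 1, 1),
    datetime(2009, 1, 1),
    datetime(2012, 7, 1),
    datetime(2015, 7, 1),
    datetime(2017, 1, 1),
]


def met_to_utc(met):
    """Convert MET seconds to a naive datetime expressed in UTC."""
    naive = MET_EPOCH + timedelta(seconds=met)
    leaps = 0
    for boundary in LEAP_BOUNDARIES:
        # Count each leap second that has already elapsed at this instant.
        if naive - timedelta(seconds=leaps) >= boundary:
            leaps += 1
    return naive - timedelta(seconds=leaps)


if __name__ == "__main__":
    print(met_to_utc(354923690))
```

For the last run, 354923690 (MET), this sketch yields 2012-03-31 21:54:48, matching the UTC value in the table.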

Progress at the 1-year mark:

First run

239557414 (MET), 2008-08-04 15:43:34 (UTC)

 

Last run

271999199 (MET), 2009-08-15 03:19:57 (UTC)

 

Total runs

5600

 

Total input DIGI events

11,928,911,465

 

Total RECON events

11,928,911,465

161.4 TB

Total CAL events

11,928,911,465

36.1 TB

Total GCR events

11,928,911,465

260.5 GB

Total MERIT events

11,928,911,465

9.6 TB

all triggered events

Total FILTEREDMERIT events

1,572,783,868

1.3 TB

all photon event classes

Total EXTENDEDFT1 events

1,572,783,826

143.7 GB

all photon event classes

Total LS1 events

1,572,783,826

255.0 GB

all photon event classes

Total LS1 (FSSC selection) events

271,923,333

44.2 GB

event classes (bits) 0,2,3,4 (transient, source, clean, ultraclean)

Total FT1 (FSSC selection) events

24,261,962

2.4 GB

event classes (bits) 2,3,4 (source, clean, ultraclean)
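Since total disk space for the full dataset is listed as N/A above, the 1-year sizes can be scaled by event count to get a rough estimate. The sketch below assumes that size scales linearly with the number of events, which is only an approximation; the function name `extrapolate_tb` is hypothetical, and the event totals are the ones quoted on this page.

```python
# Event totals quoted on this page.
ONE_YEAR_EVENTS = 11_928_911_465   # 1-year mark
FULL_EVENTS = 44_125_679_961       # full reprocessing

# Sizes at the 1-year mark, in TB, for the products holding every event.
ONE_YEAR_SIZE_TB = {"RECON": 161.4, "CAL": 36.1, "MERIT": 9.6}


def extrapolate_tb(size_tb, events_from=ONE_YEAR_EVENTS, events_to=FULL_EVENTS):
    """Scale a size linearly with event count (an assumption, not a measurement)."""
    return size_tb * events_to / events_from


if __name__ == "__main__":
    for name, size_tb in ONE_YEAR_SIZE_TB.items():
        print(f"{name}: ~{extrapolate_tb(size_tb):.1f} TB")
```

The estimates can be compared with the sizes reported in the Data Catalog summary further down this page.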

[to be continued...]

Bookkeeping

  1. (This page): Define ingredients of reprocessing (processing code/configuration changes)
  2. Processing History database: http://glast-ground.slac.stanford.edu/HistoryProcessing/HProcessingRuns.jsp?processingname=P202
    1. List of all reprocessings
    2. List of all data runs reprocessed
    3. Pointers to all input data files (-> dataCatalog)
    4. Pointers to associated task processes (-> Pipeline II status)
  3. Data Catalog database: http://glast-ground.slac.stanford.edu/DataCatalog/folder.jsp
    1. Lists of and pointers to all output data files
    2. Meta data associated with each output data product

...

P202-ROOT

Status chronology

  • 2/13/2012 - begin trials with final calibration and alignments from Leon; 5 runs reprocessed
  • 2/14/2012 - trials continue with blocks of 15, 20, 25 and 50 runs reprocessed (each run generates ~20 batch jobs)
  • 2/16/2012 - begin trickleStream production. Initial config:
    
    ===============================================================================
      TRICKLE PARMS
    ===============================================================================
    task =  P202-ROOT
    maxRuns =  19172
    firstStep =  setupRun
    steps =  [['/processRun processClump', 1500, 20], ['mergeClumps', 70, 1]]
    maxStreamsPerCycle =  20
    timePerCycle =  300
    ===============================================================================
    
  • 2/21/2012 - One clump reprocessed with pointer to new mySQL DB (stream 710.0)
  • 2/22/2012 - 776 runs complete. Pausing task.
  • 3/15/2012 - resume task. New goal is 1-year of data (~5600 runs)
  • 3/31/2012 - 1-year complete (5600 runs). There have been a few nasty problems which need to be fixed before continuing.

    S/W component | bug fix | status
    New ROOT version | 5-min 'transaction timeout' triggered by xroot data server reboot | done 4/3/2012
    New GlastRelease | 1) include new ROOT version (above); 2) exit with non-zero RC on ROOT write error | done 4/5/2012, GR 17-35-24-rp04
    New GPL_TOOLS (?) | check size/checksum of file written to xroot with known size/checksum | pending
    Tuned xroot on new Dell servers | silent file truncation when volume fills up (JIRA) | done 4/4/2012 (100 MB min space limit -> 100 GB; file system space check cadence changed from 10 min to 2 min)
    New xroot client tools | complain when xroot data server fails on write | done 4/3/2012, v3.1.1
    New TSkim | 1) new ROOT version (above); 2) complain on ROOT write errors | done 4/5/2012, v08-02-01
    New xroot redirector | required step toward enabling HPSS staging | done 4/3/2012, v3.1.1

    Note also that the FILTEREDMERIT files contain 42 more events than the EXTENDEDFT1 files; they should be identical.
  • 4/5/2012 - resume task. New goal is entire science dataset.
  • 4/10/2012 - Unknown 'glitch' may have caused a few hundred jobs to crash and take sulky46 along with them.
  • 4/11/2012 - 10:40pm lightning strikes SLAC power lines. Site-wide power outage. Stream 7795 was the last stream submitted prior to the outage.
  • 4/12/2012 - due to possible overload of sulky46/u18 from writing many core files, one change was introduced in processClumps.py: prepend "ulimit -c 0;" to the gleam command to disable all core file generation. This starts approximately with run 7605 (+/-).
  • 5/9/2012 - major pipeline issue...shut down pipeline and allow to drain (due to tomorrow's major outage)
  • 5/10/2012 - 13:40 outage over.
    • Update GR from 17-35-24-rp04 to 17-35-24-rp07, in which the only change is replacing the 5-minute xroot time-out with 8 hours. This change is effective with stream 14314 and with the previously failed pieces of four other runs: 14247.6, 14273.23, 14274.8, 14231.9.
    • Leon advises that as of today, calibrations are valid only through ~15 Dec 2011 (run 345574915), which is somewhere around stream 18,400. He asks Sasha to produce more up-to-date calibrations.
  • 5/18/2012 - all calibrations now valid through 6 May 2012. No need to pause P202 task.
  • 5/28/2012 - 15:30 Complete (through 31 March 2012)
    • Data Catalog summary:  

      Name | Type | Files | Events | Size | Created (UTC) | Links
      CAL | Group | 20229 | 44,125,599,595 | 128.7 TB | 25-Jan-2012 00:53:31 | Files
      ELECTRONFT1 | Group | 5600 | 0 | 2.5 GB | 02-Mar-2012 00:06:07 | Files
      ELECTRONMERIT | Group | 20229 | 90,904,582 | 205.7 GB | 25-Jan-2012 00:53:32 | Files
      EXTENDEDFT1 | Group | 5600 | 1,572,783,826 | 143.7 GB | 02-Mar-2012 00:06:09 | Files
      EXTENDEDLS1 | Group | 5600 | 1,572,783,826 | 255.0 GB | 02-Mar-2012 00:06:09 | Files
      FILTEREDMERIT | Group | 20229 | 6,291,396,710 | 5.3 TB | 25-Jan-2012 00:53:29 | Files
      FT1 | Group | 5600 | 24,261,962 | 2.4 GB | 02-Mar-2012 00:06:06 | Files
      GCR | Group | 20229 | 44,123,014,456 | 942.7 GB | 25-Jan-2012 00:53:31 | Files
      LS1 | Group | 5600 | 271,923,333 | 44.2 GB | 02-Mar-2012 00:06:08 | Files
      MERIT | Group | 20229 | 44,125,679,961 | 35.4 TB | 25-Jan-2012 00:53:30 | Files
      RECON | Group | 20229 | 44,123,612,977 | 590.0 TB | 25-Jan-2012 00:53:33 | Files

      There are discrepancies to track down!
      Turns out to be three problematic runs/streams:
      • 272707024/5723 - I/O problem, corrupt files; entire stream rolled back
      • 279108810/6847 - transient xroot access problem; re-registered in dataCat
      • 284813327/7848 - transient xroot access problem; re-registered in dataCat
  • Final trickleStream configuration:
    
    ===============================================================================
      TRICKLE PARMS
    ===============================================================================
    task =  P202-ROOT
    maxRuns =  20229
    firstStep =  setupRun
    steps =  [['/processRun processClump', 2000, 21], ['mergeClumps', 200, 1]]
    maxStreamsPerCycle =  20
    timePerCycle =  300
    ------DEBUG----------------
    maxCycles =  0
    chatter =  False
    dryRun =  False
    ===============================================================================
    
    
  • 5/31/2012 - Rolling back all or part of the three runs above resolved the discrepancies in event counts. The new dataCatalog tally looks like this:

Name | Type | Files | Events | Size | Created (UTC) | Links
CAL | Group | 20229 | 44,125,679,961 | 128.7 TB | 25-Jan-2012 00:53:31 | Files
ELECTRONMERIT | Group | 20229 | 90,904,582 | 205.7 GB | 25-Jan-2012 00:53:32 | Files
FILTEREDMERIT | Group | 20229 | 6,291,396,711 | 5.3 TB | 25-Jan-2012 00:53:29 | Files
GCR | Group | 20229 | 44,125,679,961 | 942.7 GB | 25-Jan-2012 00:53:31 | Files
LS1 | Group | 5600 | 271,923,333 | 44.2 GB | 02-Mar-2012 00:06:08 | Files
MERIT | Group | 20229 | 44,125,679,961 | 35.4 TB | 25-Jan-2012 00:53:30 | Files
RECON | Group | 20229 | 44,125,679,961 | 590.0 TB | 25-Jan-2012 00:53:33 | Files
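The discrepancy hunt described in the chronology amounts to comparing event counts across the data products that should hold every event (CAL, GCR, MERIT, RECON). A minimal sketch of that comparison is given below; it takes MERIT as the reference, which is an assumption made here because its count equals the total input DIGI count quoted on this page. The function name `missing_events` is hypothetical.

```python
# Event counts from the 5/28/2012 Data Catalog summary on this page.
COUNTS_28_MAY = {
    "CAL": 44_125_599_595,
    "GCR": 44_123_014_456,
    "MERIT": 44_125_679_961,
    "RECON": 44_123_612_977,
}

# Event counts from the 5/31/2012 tally, after the rollbacks.
COUNTS_31_MAY = {
    "CAL": 44_125_679_961,
    "GCR": 44_125_679_961,
    "MERIT": 44_125_679_961,
    "RECON": 44_125_679_961,
}


def missing_events(counts, reference="MERIT"):
    """Return {product: shortfall} for products that disagree with the reference."""
    expected = counts[reference]
    return {name: expected - n for name, n in counts.items() if n != expected}


if __name__ == "__main__":
    print("28 May:", missing_events(COUNTS_28_MAY))
    print("31 May:", missing_events(COUNTS_31_MAY))
```

An empty result for the later tally is what "discrepancies solved" looks like in this check.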

Configuration

Task Location

/nfs/farm/g/glast/u38/Reprocess-tasks/P202-ROOT

Task Status

http://glast-ground.slac.stanford.edu/Pipeline-II/index.jsp

GlastRelease

17-35-24-gr17 (SCons RHEL4-32 build)

Run Selection

based on a modified "standard" selection, see https://confluence.slac.stanford.edu/display/SCIGRPS/Official+LAT+Datasets
(((sIntent=="nomSciOps" || sIntent=="nomSO_noSk_noCno_optGccc_allEna" || sIntent=="nomSciOps_diagEna" || (sIntent=="nomSciOps_Emin5MeV"&&RunMin>242070455) || nRun==242429468 ) && (RunQuality != "Bad" || is_null ( RunQuality ) ) ) || sIntent=="nadirOps" )

s/c data

"standard" Public Release 2 https://confluence.slac.stanford.edu/display/SCIGRPS/Official+LAT+Datasets

Input Run List

ftp://ftp-glast.slac.stanford.edu/glast.u38/Reprocess-tasks/P202-ROOT/config/runList.txt

photonFilter

CTBParticleType==1 && ((FT1EventClass & 0x00003EFF)!=0)
pass7.6_Extended_cuts_L1 in evtClassDefs

electronFilter

CTBParticleType==1

Code Variants

redhat4-i686-32bit-gcc34 (Optimized)

jobOpts

ftp://ftp-glast.slac.stanford.edu/glast.u38/Reprocess-tasks/P202-ROOT/config/doRecon.txt

Output Data Products

RECON, GCR, CAL, MERIT, FILTEREDMERIT, ELECTRONMERIT
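The Run Selection expression above is dense, so its logic is restated here as a Python predicate. This is only a reading aid: the argument names mirror the fields in the expression, a null RunQuality is represented as `None`, and the function name `run_selected` is hypothetical. The actual selection is applied by the query itself, not by this code.

```python
# sIntent values accepted unconditionally (subject to the run-quality check).
NOMINAL_INTENTS = {
    "nomSciOps",
    "nomSO_noSk_noCno_optGccc_allEna",
    "nomSciOps_diagEna",
}


def run_selected(s_intent, run_min, n_run, run_quality):
    """Mirror the boolean structure of the Run Selection expression."""
    nominal = (
        s_intent in NOMINAL_INTENTS
        or (s_intent == "nomSciOps_Emin5MeV" and run_min > 242070455)
        or n_run == 242429468  # the TrigTest run declared 'good for science'
    )
    # is_null(RunQuality) is modelled as run_quality being None.
    quality_ok = run_quality is None or run_quality != "Bad"
    # nadirOps runs are accepted regardless of the quality check.
    return (nominal and quality_ok) or s_intent == "nadirOps"


if __name__ == "__main__":
    print(run_selected("TrigTest", 242429468, 242429468, "Good"))
    print(run_selected("nomSciOps", 250000000, 250000000, "Bad"))
```

Note that, as written in the expression, the run-quality condition does not apply to nadirOps runs.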

...

Data Product | destination | data content [1] | event selection [1] | makeFT1 | gtselect | gtdiffrsp | gtmktime
EXTENDEDFT1 | SLAC | FT1variables | ((FT1EventClass & 0x00003EFF)!=0); pass7.6_Extended_cuts_L1 | (tick) | (error) | (tick) | (tick)
FT1 | FSSC+SLAC | FT1variables | 'source' and above; EVENT_CLASS bits 2,3,4; evclass=2 filtered from EXTENDEDFT1 | (error) | (tick) | (inherited) | (tick)
EXTENDEDLS1 | SLAC | LS1variables | ((FT1EventClass & 0x00003EFF)!=0); pass7.6_Extended_cuts_L1 | (tick) | (error) | (tick) | (tick)
LS1 | FSSC+SLAC | LS1variables | 'transient' and above; EVENT_CLASS bits 0,2,3,4; evclass=0 filtered from EXTENDEDLS1 | (error) | (tick) | (inherited) | (tick)
ELECTRONFT1 | SLAC | FT1variables | CTBParticleType==1; pass7.6_Electrons_cuts_L1 | (tick) | (error) | (error) | (tick)
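The event selections above are expressed as bit tests on the event class word. The sketch below illustrates only the bit arithmetic: it builds masks from the listed bit numbers and tests whether any masked bit is set. Treating the FT1 and LS1 selections as "any of the listed bits set" is an assumption made for illustration; in production these selections are described as gtselect cuts (evclass=2 and evclass=0). The helper names are hypothetical.

```python
# Mask used by the photon filter and the EXTENDED products.
EXTENDED_MASK = 0x00003EFF


def mask_from_bits(bits):
    """Build an integer mask with the given bit positions set."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


def set_bits(mask, width=32):
    """List the bit positions that are set in mask."""
    return [bit for bit in range(width) if (mask >> bit) & 1]


def passes(event_class, mask):
    """True if the event class word shares at least one bit with the mask."""
    return (event_class & mask) != 0


# Bits 2,3,4 (source, clean, ultraclean) and 0,2,3,4 (adds transient).
FT1_MASK = mask_from_bits([2, 3, 4])
LS1_MASK = mask_from_bits([0, 2, 3, 4])

if __name__ == "__main__":
    print("EXTENDED bits:", set_bits(EXTENDED_MASK))
    print("FT1 mask:", hex(FT1_MASK), "LS1 mask:", hex(LS1_MASK))
    transient_only = 1 << 0
    print(passes(transient_only, LS1_MASK), passes(transient_only, FT1_MASK))
```

Decoding the mask this way shows that 0x00003EFF covers bits 0 through 7 and 9 through 13, leaving bit 8 out.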

...