Notes on the Big Run and Monte Carlo Data Access

[prepared for a short presentation on 19 March 2008]

What is the Big Run?

What *types* of MC datasets are typically generated? 

What MC datasets have been generated?

What triggers the generation of new MC datasets?

What is scheduled for the near (pre-launch) future?

What about Pass6?

How does one stay up-to-date with regard to exactly what is available and planned?

Once I've figured out what data to use, how do I get my hands on it?



What is the Big Run?

A catch-all term covering most production Monte Carlo work since December 2007. It has many goals (which change with time!), including:

  • produce a very large background dataset (8 full days)
  • produce a 5-day background+full sky model for OpsSim2
  • produce a 1-year interleaved background+full sky model dataset

What *types* of MC datasets are typically generated? 

  • Relatively frequent "standard" datasets (with selected, typical configurations)
    • allGamma
      • 100% gammas between 18 MeV and 562 GeV
      • Normal trigger, NO OBF
      • ~10M evts generated
      • generated on upper hemisphere
    • allMuon
      • 100% muons
      • NO trigger, NO OBF
      • ~few M evts generated
    • sample day background
      • all background sources
        • spanning one calendar day (one 0.2 s run every minute => 1440 runs)
      • NO trigger, NO OBF
    • background
      • all background sources
      • temporally contiguous runs (large statistics...~1 B evts generated)
      • normal trigger, normal OBF
  • Less frequent specialized datasets
    • background + full sky model
    • Interleaved background + full sky model
    • GRBgrid (a GRB in every job)
    • special trigger and/or OBF
    • Unusual orientations of the spacecraft (s/c), including pointed observations
    • special orbit files
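
The sample-day figure above (one 0.2 s run every minute, giving 1440 runs) can be checked with a few lines of Python. This is only a minimal sketch: it assumes that exactly one run starts per minute and that each run covers 0.2 s of simulated time, and the function names are hypothetical, not part of any GLAST tool.

```python
# Sanity check of the "sample day background" run count.
# Assumptions (not stated explicitly in the notes): exactly one run
# starts per minute, and each run covers 0.2 s of simulated time.

MINUTES_PER_DAY = 24 * 60   # one run per minute of the calendar day
RUN_LENGTH_S = 0.2          # simulated seconds covered by each run


def sample_day_runs(days: int = 1) -> int:
    """Number of runs needed to sample the given number of calendar days."""
    return days * MINUTES_PER_DAY


def simulated_seconds(days: int = 1) -> float:
    """Total simulated time (in seconds) covered by those runs."""
    return sample_day_runs(days) * RUN_LENGTH_S


if __name__ == "__main__":
    print("runs per day:", sample_day_runs())
    print("simulated seconds per day:", simulated_seconds())
```

Under these assumptions a sample day covers only about 288 s of simulated time in total, spread evenly over the day. That is presumably why it can serve as a quick look at all background sources, while the contiguous "background" dataset supplies the large statistics.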

What MC datasets have been generated?

  • All production MC datasets are created as GLAST Pipeline "tasks" via batch jobs on the SLAC and Lyon compute farms.
  • The production of essentially all MC datasets is documented on the Confluence page "Service Challenge Monte Carlo Processing Summary".
  • Note that this page documents the production itself, but neither the code nor the complete configuration that goes into it.

What triggers the generation of new MC datasets?

  • A (significant) new GlastRelease (e.g., a new generation of classification tree analysis)
  • Request from C&A group
  • Request from other analysis or detector group
  • Requests must (ultimately) be channeled through Richard Dubois

What is scheduled for the near (pre-launch) future?

  • Will we do the 1-year interleaved background + full sky model run? Open question...
  • See Richard's Big Run Checklist page for all the details
  • Richard and Julie are your best contacts for Big Run plans (but they are both really busy, so first try to discover the answer yourself)
  • We hope to fix the rough edges discovered during the OpsSim2 data processing review

What about Pass6?

  • As of this moment there are NO production MC datasets with Pass6 analysis
  • As of yesterday morning, the finishing touches on the definition of Pass6 and on the mechanism to reprocess some existing data were still being hammered out
  • Once Pass6 is up and running, one should expect all future MC datasets to use it
  • With some luck, the OpsSim2 dataset (8 full days background + sky model) may be reprocessed sometime next week... stay tuned.

How does one stay up-to-date with regard to exactly what is available and planned?


Once I've figured out what data to use, how do I get my hands on it?
