Officially started on Dec 22, 2015.

HPS-JAVA SnapshotRelease:  3   3.5-20151218.205540-158-fix

DST-maker Release:  0.10

Run Spreadsheet:   Google Docs

Detector:  HPS-EngRun2015-Nominal-v3v4-4-fieldmap

Batch Farm Scripts:

  • /u/group/hps/production/data/EngRun2015/
    pass4
    pass6
  • github

Outputs:

  • Disk:  /work/hallb/hps/data/engrun2015/pass6

    • All DQ(M) and logs

    • Unskimmed Recon:

      • LCIO:  Canonical and Calibration runs only
      • DST:  All
    • Skimmed Recon:

      • LCIO:  moller, pulser, v0
      • DST:  All (fee, moller, pulser, v0, s0, p0)
    • v0 skims (the *0 files only)

    • v0pulser skim

  • Tape:  /mss/hallb/hps/engrun2015/pass6
    • Everything
  • Skims:  FEE, Moller, Pulser, V0, S0, P0

Changes Relative to Pass-4 (pass5 was aborted and does not exist):

  • Full 100% pass.
  • ECal halves shifted asymmetrically in y based only on tracking.
  • Fixed track state at ECal with full 3-D field extrapolator
  • Improved ECal energy and position corrections
  • Corrected SVT 1.5 mm alignment
  • New track/cluster matching with quality factor now in ReconParticles
  • GBL running only in hps-java, not in dst-maker
  • New collection to store RF time extracted from waveform
  • ...


Steering Files:

Recon:  /org/hps/steering/recon/EngineeringRun2015FullRecon.lcsim
DQ:  /org/hps/steering/production/DataQuality.lcsim
DQM:  /org/hps/steering/production/DataQualityRecon.lcsim
Pulser:  /org/hps/steering/production/PulserTriggerFilter.lcsim
Moller:  /org/hps/steering/production/MollerCandidateFilter.lcsim
FEE:  /org/hps/steering/production/FEEFilter.lcsim
V0:  /org/hps/steering/production/V0CandidateFilter.lcsim
S0:  /org/hps/steering/production/Single0TriggerFilter.lcsim
P0:  /org/hps/steering/production/Pair0TriggerFilter.lcsim


Output Directory Structure:

Result:

3.5K EVIO files were processed in about 5 days on the batch farm (in competition with upass4) with a "failure" rate of only 1%, almost entirely due to timeouts (the limit was set at 30 hours).  Failures were resubmitted on 1/1/2016 with a much larger time limit; 7 of 30 jobs timed out again and appear to be stuck in Minuit code in the DQM.

30% of the EVIO files resulted in empty (filesize=0) v0 and moller skims (here's a list of their run/file numbers).  80% of those empty skims were from 1.5 mm data, which is about 50% of all 1.5 mm files.  Note that files of zero size cannot be written to tape, so you will not see them in /mss, and that no jobs resulted in empty pulser skims.
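As a rough cross-check of the fractions quoted above, the following Python sketch redoes the arithmetic. The inputs are the rounded numbers from the text (3.5K files, 1%, 30%, 80%, 50%), so the results are only approximate; the variable names are illustrative and not taken from any production script.

```python
# Rounded inputs taken from the text above; all results are approximate.
n_files = 3500            # "3.5K EVIO files"
failure_rate = 0.01       # "failure" rate of about 1%
empty_fraction = 0.30     # fraction of files with empty v0/moller skims
empty_from_1p5mm = 0.80   # fraction of those empties that come from 1.5 mm
fraction_of_1p5mm = 0.50  # those empties as a fraction of all 1.5 mm files

n_failed = round(n_files * failure_rate)
n_empty = round(n_files * empty_fraction)
n_empty_1p5mm = round(n_empty * empty_from_1p5mm)
n_1p5mm_total = round(n_empty_1p5mm / fraction_of_1p5mm)

print(n_failed, n_empty, n_empty_1p5mm, n_1p5mm_total)
```

With these rounded inputs the estimate is a few tens of failed jobs and several hundred empty 1.5 mm skims, which is the same order as the 30 resubmitted jobs and the roughly 800 empty 1.5 mm skims mentioned in this section.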

Here's a histogram of wall time per EVIO file (the 800 empty 1.5 mm v0/moller skims surely correspond to the peak at 9.5 hours, although I did not confirm that), where the bump at 30 hours is timeout failures:

[Histogram of wall time per EVIO file; image not available]

Disk Usage:

/work/hallb/hps/data/engrun2015/pass4

2.5 TB:

Skim Cuts:

Moller                    V0                 FEE            S0, P0, Pulser
p1,p2 < 0.85 GeV          GBL Only           e > 0.6 GeV    TI Trigger Bit
0.85 < p1+p2 < 1.3 GeV    χ²(vertex) < 10    s > 0.4 GeV
|t1-t2| < 2.5 ns          χ²(track) < 20
40 < t1,t2 < 48 ns        p1,p2 > 0.9 GeV
-175 < x1+x2 < -145 mm    p1+p2 < 1.4 GeV
|x1-x2| < 80 mm
0.85 < e1+e2 < 1.1 GeV
|e1-e2| < 0.3 GeV
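To make the Moller cuts listed above concrete, here is a minimal Python sketch that applies them to a pair of electron candidates. It is not the code in MollerCandidateFilter.lcsim. The `Electron` class, its field names, and the reading of p, t, x and e as track momentum, cluster time, cluster x position and cluster energy are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Electron:
    """Hypothetical container for one electron candidate (not an hps-java class)."""
    p: float  # momentum in GeV (assumed meaning of "p")
    t: float  # time in ns (assumed meaning of "t")
    x: float  # x position in mm (assumed meaning of "x")
    e: float  # energy in GeV (assumed meaning of "e")


def passes_moller_cuts(a: Electron, b: Electron) -> bool:
    """Return True if the pair satisfies every cut in the Moller column."""
    return (
        a.p < 0.85 and b.p < 0.85            # p1,p2 < 0.85 GeV
        and 0.85 < a.p + b.p < 1.3           # 0.85 < p1+p2 < 1.3 GeV
        and abs(a.t - b.t) < 2.5             # |t1-t2| < 2.5 ns
        and 40 < a.t < 48 and 40 < b.t < 48  # 40 < t1,t2 < 48 ns
        and -175 < a.x + b.x < -145          # -175 < x1+x2 < -145 mm
        and abs(a.x - b.x) < 80              # |x1-x2| < 80 mm
        and 0.85 < a.e + b.e < 1.1           # 0.85 < e1+e2 < 1.1 GeV
        and abs(a.e - b.e) < 0.3             # |e1-e2| < 0.3 GeV
    )


# Made-up example values, chosen only to exercise the cuts.
good_pair = (Electron(0.50, 44.0, -60.0, 0.50), Electron(0.55, 45.0, -100.0, 0.50))
late_pair = (Electron(0.50, 44.0, -60.0, 0.50), Electron(0.55, 50.0, -100.0, 0.50))
print(passes_moller_cuts(*good_pair), passes_moller_cuts(*late_pair))
```

The second pair differs only in the time of one candidate, which falls outside both the 40 to 48 ns window and the 2.5 ns coincidence requirement.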

 

=============================== Status of finished jobs =====================================


====================== Summary of Missing files ======================
Number of Missing Recons              30
Number of Missing DSts                0
Number of Missing DQMs                751
Number of Missing pulsers             5
Number of Missing pulser DSts         0
Number of Missing s0s                 2
Number of Missing s0 DSTs             0
Number of Missing p0s                 3
Number of Missing p0 DSTs             0
Number of Missing fees                1
Number of Missing fee DSTs            3
Number of Missing Mollers             9503
Number of Missing Moller DSTs         9504
Number of Missing V0s                 9502
Number of Missing V0 DSTs             9503
================================================================
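A summary in the format above is easy to turn into numbers for bookkeeping. The sketch below is one possible way to do it in Python; the function name and the regular expression are illustrative assumptions, and only a few of the lines above are reproduced as sample input.

```python
import re

# A few lines copied from the summary above, used as sample input.
SUMMARY = """\
Number of Missing Recons              30
Number of Missing DSts                0
Number of Missing DQMs                751
Number of Missing Mollers             9503
"""

# Captures the file type (which may contain spaces) and the trailing count.
LINE_RE = re.compile(r"^Number of Missing (.+?)\s+(\d+)$")


def parse_missing_summary(text):
    """Map each file type to its missing-file count; other lines are ignored."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_missing_summary(SUMMARY)
nonzero = {name: n for name, n in counts.items() if n > 0}
print(nonzero)
```

Separator lines made of "=" characters do not match the pattern and are skipped, so the whole summary block can be passed in unchanged.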

 

The following recon jobs failed.

Most of them timed out; however, some crashed at the beginning.

Run     File number(s)            Failure reason
5783    105                       timeout
5775    74                        timeout
5754    4                         Exception_1
5741    161                       timeout
5694    186                       timeout
5610    148                       timeout
5579    103                       timeout
5578    435                       timeout
5568    8                         Exception_1
5566    18                        Exception_1
5546    32                        Exception_1
5541    3                         Exception_1
5410    201                       timeout, tons of Exception_2 (probably every event)
5405    98                        timeout
5381    127                       timeout
5348    47                        timeout
5345    0                         timeout
5313    100                       timeout
5288    11                        timeout
5267    33                        timeout
5256    44, 46, 50                timeout
5257    13, 17, 32, 34, 36, 39    timeout
5254    18                        timeout


Exception_1

2016-05-18 13:28:38 [INFO] org.hps.evio.LCSimEngRunEventBuilder setTiTimeOffsetForRun :: TI time offset set to 1431298043480321136 for run 5568 from database
2016-05-18 13:28:38 [INFO] org.hps.conditions.database.DatabaseConditionsManager initialize :: conditions system initialized successfully
2016-05-18 13:28:38 [CONFIG] org.hps.conditions.database.DatabaseConditionsManager freeze :: conditions system is frozen
2016-05-18 13:28:38 [INFO] org.hps.evio.EvioToLcio run :: Opening EVIO file in.evio
Exception in thread "main" java.lang.NegativeArraySizeException
    at org.jlab.coda.jevio.EventParser.parseStructure(EventParser.java:126)
    at org.jlab.coda.jevio.EventParser.parseEvent(EventParser.java:62)
    at org.jlab.coda.jevio.EvioReader.parseEvent(EvioReader.java:1449)
    at org.hps.evio.EvioToLcio.bufferEvents(EvioToLcio.java:186)
    at org.hps.evio.EvioToLcio.run(EvioToLcio.java:501)
    at org.hps.evio.EvioToLcio.main(EvioToLcio.java:98)


Exception_2

org.hps.evio.AugmentedSvtEvioReader processSvtHeaders :: Caught 11 SvtEvioHeaderExceptions for event 53391095 of 3 types: SvtEvioHeaderApvBufferAddressException SvtEvioHeaderApvFrameCountException SvtEvioHeaderSyncErrorException
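When going through many job logs, the two failure signatures above can be tagged automatically. The following Python sketch is only an illustration: the labels follow the naming on this page, the search strings are taken from the log excerpts above, and the function name is hypothetical.

```python
# Substrings copied from the log excerpts above; the labels follow this page.
SIGNATURES = {
    "Exception_1": "java.lang.NegativeArraySizeException",
    "Exception_2": "SvtEvioHeaderException",
}


def classify_log(log_text):
    """Return the labels of all known failure signatures found in a log."""
    return [label for label, needle in SIGNATURES.items() if needle in log_text]


sample = 'Exception in thread "main" java.lang.NegativeArraySizeException'
print(classify_log(sample))
```

A log containing neither substring yields an empty list, which in this scheme would leave timeouts to be identified separately, for example from the batch system's job status.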

