This page gathers some information about the CPU time used by the Pass8 GRs.
Here is the associated Jira item: LPATE-102@JIRA

Summary

  • CPU time has been pretty stable over the past months: we've improved some things but also added more
  • As of GR 20-08-10: Tkr, CAL2 and Other are each 1/3 of total CPU
  • CRFinder is 15% of total CPU and must be sped up
  • Have to look into the merit tuple Auditor variables
  • Work to be done on outliers
    • Outliers in the plots below are jobs running on slow machines (fell)
    • Outliers seem to be related to swapping
    • I'll try requesting 2GB of RAM instead of just 1GB when submitting jobs; it may help speed things up

Plots

  • I gather the data from the log files of the Pass8 Solar Flare reprocessing task:
    /nfs/farm/g/glast/u44/CalibSets-Tasks/Pass8_SFRs_Repro/output/Pass8_SFRs_Repro/"version"/run-setupCrumbs/digi-skimNrepro-crumb/runPass8Repro/"run"/"chunk"/"crumb"/logFile.txt
    with "version" in ['2.6', '2.7', '2.8', '2.9']
    corresponding to GR 20-08-06 to 20-08-10
    
  • All CPU times are normalized to jobs running on the hequ workers
  • CPU Time since 20-08-00
  • Plot for the past 4 releases: 20-08-06 to 20-08-10 (GRScan_p6Top10.root)

    CPU Time

    RAM

    Swap

    Cal2 (Energy)

    Tkr First Pass

    Tkr Tree Links

    Tkr Tree

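The log files listed above could be collected with something like the sketch below. It assumes the quoted placeholders ("run", "chunk", "crumb") are plain directory levels that can be matched with wildcards. The names `log_pattern` and `find_logs` are illustrative only and are not part of the reprocessing task.

```python
from pathlib import Path

# Root directory and layout copied from the path quoted above.
ROOT = Path("/nfs/farm/g/glast/u44/CalibSets-Tasks/Pass8_SFRs_Repro/"
            "output/Pass8_SFRs_Repro")
VERSIONS = ["2.6", "2.7", "2.8", "2.9"]


def log_pattern(version):
    """Glob pattern (relative to the root) for one task version.

    Assumption: "run", "chunk" and "crumb" are single directory levels.
    """
    return (f"{version}/run-setupCrumbs/digi-skimNrepro-crumb/"
            "runPass8Repro/*/*/*/logFile.txt")


def find_logs(root=ROOT, versions=VERSIONS):
    """Yield (version, run, chunk, crumb, path) for every log file found."""
    root = Path(root)
    for version in versions:
        for path in sorted(root.glob(log_pattern(version))):
            # The three directories above logFile.txt are run/chunk/crumb.
            run, chunk, crumb = path.parts[-4:-1]
            yield version, run, chunk, crumb, path
```

Keeping run, chunk and crumb next to each path makes it easy to match a log file to the outlier listings further down this page.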

Breaking up 20-08-11

  • Global performance
  • Outliers at 3 sigma
    Min/Average/Max CPU time 548/2521/6401
    CPU outliers
    host hequ run 1 chunk 15 crumb 45 normcpu 2751.960000
    host kiso run 2 chunk 22 crumb 10 normcpu 2953.872000
    host kiso run 2 chunk 22 crumb 46 normcpu 2685.288000
    host kiso run 2 chunk 22 crumb 42 normcpu 2820.112000
    host hequ run 2 chunk 22 crumb 38 normcpu 3027.380000
    host kiso run 2 chunk 22 crumb 124 normcpu 2654.792000
    host kiso run 2 chunk 22 crumb 123 normcpu 2795.560000
    host fell run 2 chunk 21 crumb 52 normcpu 2883.887500
    host kiso run 2 chunk 5 crumb 92 normcpu 2854.632000
    host kiso run 2 chunk 5 crumb 76 normcpu 2829.296000
    host kiso run 2 chunk 5 crumb 24 normcpu 2767.400000
    host fell run 2 chunk 23 crumb 18 normcpu 4000.725000
    
    SWAP outliers
    host boer run 1 chunk 15 crumb 39 swap 2545.000000
    host boer run 1 chunk 15 crumb 25 swap 2584.000000
    host boer run 1 chunk 24 crumb 13 swap 2553.000000
    host fell run 2 chunk 22 crumb 132 swap 2551.000000
    host dole run 2 chunk 5 crumb 63 swap 2591.000000
    
    CAL2 outliers
    host kiso run 2 chunk 22 crumb 10 normcal2 931.200000
    host kiso run 2 chunk 22 crumb 42 normcal2 936.000000
    host fell run 2 chunk 21 crumb 52 normcal2 1245.000000
    host kiso run 2 chunk 5 crumb 76 normcal2 960.000000
    host fell run 2 chunk 23 crumb 18 normcal2 1102.500000
    
    Tkr TREE outliers
    host hequ run 1 chunk 15 crumb 45 normtree 1320.000000
    host hequ run 2 chunk 22 crumb 38 normtree 1590.000000
    host boer run 2 chunk 5 crumb 42 normtree 920.000000
    host fell run 2 chunk 23 crumb 18 normtree 2073.750000
    
    Tkr Tree LINK outliers
    host kiso run 2 chunk 22 crumb 10 normlink 389.600000
    host kiso run 2 chunk 22 crumb 46 normlink 380.000000
    host kiso run 2 chunk 22 crumb 42 normlink 385.600000
    host kiso run 2 chunk 22 crumb 123 normlink 387.200000
    host kiso run 2 chunk 5 crumb 92 normlink 360.000000
    host kiso run 2 chunk 5 crumb 76 normlink 402.720000
    
    Tkr FIRST outliers
    host hequ run 1 chunk 15 crumb 45 normfirst 1812.000000
    host kiso run 2 chunk 22 crumb 10 normfirst 1627.200000
    host kiso run 2 chunk 22 crumb 42 normfirst 1502.400000
    host hequ run 2 chunk 22 crumb 38 normfirst 2070.000000
    host kiso run 2 chunk 22 crumb 123 normfirst 1502.400000
    host kiso run 2 chunk 5 crumb 92 normfirst 1526.400000
    host fell run 2 chunk 23 crumb 18 normfirst 2576.250000
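
How the 3 sigma cut is applied is not spelled out on this page. Here is a minimal sketch, assuming a job is flagged when its value lies more than 3 standard deviations above the mean over all jobs of the release. The function name and the dictionary layout are illustrative, and the numbers in the example are made up, not taken from the logs.

```python
from statistics import mean, pstdev


def three_sigma_outliers(jobs, key, n_sigma=3.0):
    """Return the jobs whose `key` value exceeds mean + n_sigma * stddev.

    Assumption: only the high side is flagged, since slow jobs are the
    ones of interest here.
    """
    values = [job[key] for job in jobs]
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [job for job in jobs if job[key] - mu > n_sigma * sigma]


# Made-up example: 20 ordinary jobs and one very slow one.
example_jobs = [{"host": "hequ", "normcpu": 100.0} for _ in range(20)]
example_jobs.append({"host": "fell", "normcpu": 1000.0})

for job in three_sigma_outliers(example_jobs, "normcpu"):
    print("host", job["host"], "normcpu", job["normcpu"])
```

The same function can be reused for the other quantities (swap, normcal2, normtree, normlink, normfirst) by changing `key`.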
    
    

Breaking up 20-08-10

  • Global performance
  • Outliers at 3 sigma
    CPU outliers
    host fell run 2 chunk 22 crumb 73 normcpu 3006.050000
    host fell run 2 chunk 22 crumb 38 normcpu 3609.431250
    host fell run 1 chunk 25 crumb 5 normcpu 4184.412500
    
    SWAP outliers
    host fell run 2 chunk 5 crumb 5 swap 2590.000000
    host boer run 2 chunk 5 crumb 108 swap 2586.000000
    host hequ run 2 chunk 5 crumb 6 swap 2589.000000
    host bali run 2 chunk 5 crumb 27 swap 2586.000000
    host yili run 2 chunk 5 crumb 84 swap 2549.000000
    host boer run 2 chunk 21 crumb 32 swap 2579.000000
    host boer run 2 chunk 22 crumb 71 swap 2585.000000
    host fell run 2 chunk 22 crumb 50 swap 2542.000000
    host fell run 2 chunk 22 crumb 73 swap 2581.000000
    host dole run 1 chunk 24 crumb 28 swap 2560.000000
    host dole run 1 chunk 24 crumb 19 swap 2583.000000
    host fell run 1 chunk 24 crumb 23 swap 2631.000000
    host hequ run 1 chunk 15 crumb 5 swap 2585.000000
    host fell run 1 chunk 25 crumb 46 swap 2625.000000
    host dole run 1 chunk 25 crumb 0 swap 2580.000000
    host dole run 1 chunk 17 crumb 18 swap 2557.000000
    host fell run 1 chunk 17 crumb 57 swap 2591.000000
    host dole run 1 chunk 17 crumb 91 swap 2584.000000
    host dole run 0 chunk 3 crumb 4 swap 2541.000000
    
    CAL2 outliers
    host fell run 2 chunk 22 crumb 73 normcal2 1301.250000
    host fell run 1 chunk 25 crumb 5 normcal2 1691.250000
    
    Tkr TREE outliers
    host dole run 2 chunk 23 crumb 18 normtree 939.622642
    host fell run 2 chunk 22 crumb 14 normtree 795.000000
    host fell run 2 chunk 22 crumb 38 normtree 2006.250000
    host kiso run 1 chunk 15 crumb 45 normtree 1012.800000
    host fell run 1 chunk 25 crumb 5 normtree 990.000000
    host fell run 1 chunk 25 crumb 10 normtree 862.500000
    
    Tkr Tree LINK outliers
    host fell run 2 chunk 5 crumb 99 normlink 386.250000
    host fell run 2 chunk 22 crumb 73 normlink 390.000000
    host fell run 1 chunk 25 crumb 5 normlink 540.000000
    
    Tkr FIRST outliers
    host dole run 2 chunk 23 crumb 18 normfirst 1386.792453
    host fell run 2 chunk 22 crumb 38 normfirst 2445.000000
    host kiso run 1 chunk 15 crumb 45 normfirst 1483.200000
    host fell run 1 chunk 25 crumb 5 normfirst 1882.500000
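
To see which hosts dominate the outlier listings, the lines above can be parsed and counted per host. This is only a sketch: the regular expression is inferred from the line format shown on this page, and `parse_outlier_line` and `hosts_per_metric` are illustrative names.

```python
import re
from collections import Counter

# Format inferred from the listings: host, run, chunk, crumb, metric, value.
LINE_RE = re.compile(
    r"host (?P<host>\S+) run (?P<run>\d+) chunk (?P<chunk>\d+) "
    r"crumb (?P<crumb>\d+) (?P<metric>\S+) (?P<value>[\d.]+)"
)


def parse_outlier_line(line):
    """Parse one listing line into a dict, or return None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    for name in ("run", "chunk", "crumb"):
        record[name] = int(record[name])
    record["value"] = float(record["value"])
    return record


def hosts_per_metric(lines):
    """Count outliers per host, separately for each metric name."""
    counts = {}
    for line in lines:
        record = parse_outlier_line(line)
        if record is not None:
            counts.setdefault(record["metric"], Counter())[record["host"]] += 1
    return counts


# The three CPU outliers of 20-08-10, copied from the listing above.
cpu_20_08_10 = [
    "host fell run 2 chunk 22 crumb 73 normcpu 3006.050000",
    "host fell run 2 chunk 22 crumb 38 normcpu 3609.431250",
    "host fell run 1 chunk 25 crumb 5 normcpu 4184.412500",
]
print(hosts_per_metric(cpu_20_08_10))
```

For 20-08-10 all three CPU outliers are on fell, while the swap outliers are spread over several hosts.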