Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
titleResults for perf with mpirun on a single and 80 CPUs
collapsetrue
ana-4.0.59-py3 [dubrovin@sdfmilan216:~/LCLS/con-py3]$ perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses,dTLB-load-misses,iTLB-load-misses     mpirun -n 1 python  Detector/examples/test-scaling-mpi.py 
...
Performance counter stats for 'mpirun -n 1 python Detector/examples/test-scaling-mpi.py':

     4,448,830,552      cache-references:u                                            (50.00%)
        90,374,312      cache-misses:u            #    2.031 % of all cache refs      (50.00%)
   222,814,516,280      cycles:u                                                      (50.02%)
   426,700,282,993      instructions:u            #    1.92  insn per cycle           (50.01%)
    58,876,394,584      branches:u                                                    (50.01%)
     2,343,687,188      branch-misses:u           #    3.98% of all branches          (50.01%)
           635,183      faults:u                                                    
                 0      migrations:u                                                
           635,183      page-faults:u                                               
     2,158,358,417      L1-dcache-load-misses:u                                       (50.00%)
         5,694,036      L1-icache-load-misses:u                                       (49.99%)
         4,282,821      dTLB-load-misses:u                                            (49.99%)
           890,671      iTLB-load-misses:u                                            (50.00%)

      73.297275789 seconds time elapsed

      69.795728000 seconds user
       2.318007000 seconds sys

ana-4.0.59-py3 [dubrovin@sdfmilan216:~/LCLS/con-py3]$ perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses,dTLB-load-misses,iTLB-load-misses     mpirun -n 80 python  Detector/examples/test-scaling-mpi.py
...
 Performance counter stats for 'mpirun -n 80 python Detector/examples/test-scaling-mpi.py':

   349,526,509,383      cache-references:u                                            (50.01%)
     5,932,480,814      cache-misses:u            #    1.697 % of all cache refs      (50.00%)
18,768,444,974,036      cycles:u                                                      (50.00%)
33,983,153,714,284      instructions:u            #    1.81  insn per cycle           (49.99%)
 4,684,730,635,234      branches:u                                                    (49.99%)
   186,649,297,019      branch-misses:u           #    3.98% of all branches          (50.00%)
        52,121,421      faults:u                                                    
                 0      migrations:u                                                
        52,121,421      page-faults:u                                               
   171,500,392,922      L1-dcache-load-misses:u                                       (50.00%)
       267,672,856      L1-icache-load-misses:u                                       (50.00%)
       339,145,247      dTLB-load-misses:u                                            (50.01%)
        69,780,394      iTLB-load-misses:u                                            (50.01%)

      92.952500273 seconds time elapsed

    6501.353593000 seconds user
     410.844719000 seconds sys

Summary

number

of mpi cores

cache-

references

cache-

misses

cyclesinstructionsbranches

branch-

misses

faultspage-faults

L1-dcache-

load-misses

L1-icache-

load-misses

dTLB-

load-misses

iTLB-

load-misses

cmt
14,448,830,55290,374,312222,814,516,280426,700,282,99358,876,394,5842,343,687,188635,183635,1832,158,358,4175,694,0364,282,821890,671
80349,526,509,3835,932,480,81418,768,444,974,03633,983,153,714,2844,684,730,635,234186,649,297,01952,121,42152,121,421171,500,392,922267,672,856339,145,24769,780,394
Ratio (80)/(1)79.465.784.179.679.579.782.082.079.347.079.278.4


References