previous page: Scaling behavior of psana1 - Part 1 - det.calib method in multicore processing with mpi
2024-02-06 Test of milano216 host with command perf
Description
Using command:
perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses python test-scaling-subproc.py <parameter>
where <parameter> selects the number of CPUs used in the test: <parameter> = -1, -2, -8, -13, -16, -17, -18 stands for a test on 1, 8, 16, 32, 56, 64, 120 CPUs, respectively.
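The command above can also be assembled programmatically, which helps when the same counter list is reused for every value of <parameter>. The sketch below is only an illustration: the helper name `build_perf_command` is hypothetical and is not part of test-scaling-subproc.py; the event names are copied from the command above.

```python
import shlex

# Event names copied from the perf command used on this page.
EVENTS = [
    "cache-references", "cache-misses", "cycles", "instructions",
    "branches", "branch-misses", "faults", "migrations", "page-faults",
    "L1-dcache-load-misses", "L1-icache-load-misses",
]

def build_perf_command(parameter, script="test-scaling-subproc.py", events=EVENTS):
    """Return the perf stat command as an argument list (hypothetical helper)."""
    return ["perf", "stat", "-e", ",".join(events), "python", script, str(parameter)]

cmd = build_perf_command(-8)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, stderr=subprocess.PIPE, text=True)`; perf stat writes its counter report to stderr by default.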
Code Block | ||||
---|---|---|---|---|
| ||||
import numpy as np
from time import time, sleep

def random_standard(shape=(40,60), mu=200, sigma=25, dtype=np.float64):
    a = mu + sigma*np.random.standard_normal(shape)
    return np.require(a, dtype)

def random_arrays(sh2d = (8*512,1024), dtype=np.float64):
    sh3d = (3,) + sh2d
    return random_standard(shape=sh2d, mu=10, sigma=2, dtype=dtype),\
           random_standard(shape=sh3d, mu=20, sigma=3, dtype=dtype)

def time_consuming_algorithm():
    t01 = time()
    a, b = random_arrays()
    t02 = time()
    gr1 = a>=11
    gr2 = (a>9) & (a<11)
    gr3 = a<=9
    t03 = time()
    a[gr1] -= b[0, gr1]
    a[gr2] -= b[1, gr2]
    a[gr3] -= b[2, gr3]
    t04 = time()
    return (t01, t02, t03, t04) |
Code Block | ||||
---|---|---|---|---|
| ||||
import psutil

def do_algo(cpu=0, cmt='v0'):
    hostname = get_hostname()  # get_hostname() is not shown on this page
    #cpu_num = psutil.Process().cpu_num()
    print('requested cpu:%03d' % cpu)
    SAVE_FIGS = True
    SHOW_FIGS = False
    nevents = 100
    ntpoints = 6
    arrts = np.zeros((nevents,ntpoints), dtype=np.float64)
    t05_old = time()
    for nevt in range(nevents):
        t00 = time()
        times = time_consuming_algorithm()
        cpu_num = psutil.Process().cpu_num()
        #if cpu_num >=16 and cpu_num <=23:
        #    print('cpu_num:%03d nevt:%03d time:%.6f CPU_NUM IN WEKA RANGE [16,23]' % (cpu_num, nevt, dt_sec))
        t05 = time()
        times = (t00,) + times + (t05,)  # (t00, t01, t02, t03, t04, t05)
        arrts[nevt,:] = times
        dt_evt = t05 - t05_old
        t05_old = t05
        if nevt%10>0: continue
        dt_alg = times[4] - times[3]
        dt_in = times[4] - times[1]
        print('cpu_num:%03d nevt:%03d times (sec)' % (cpu_num, nevt), \
              ' random arrs: %.6f' % (times[2] - times[1]), \
              ' indices: %.6f' % (times[3] - times[2]), \
              ' alg: %.6f' % (times[4] - times[3]), \
              ' inside algo: %.6f' % (times[4] - times[1]), \
              ' per event: %.6f' % dt_evt)
    ...
    further code is for saving results and graphics |
Code Block | ||||
---|---|---|---|---|
| ||||
command perf, response, and accumulation of results in dict
sleep 5
Performance counter stats for 'sleep 5':
27,322 cache-references:u
6,778 cache-misses:u # 24.808 % of all cache refs
473,798 cycles:u
600,974 instructions:u # 1.27 insn per cycle
140,006 branches:u
64 faults:u
0 migrations:u
5.003108172 seconds time elapsed
0.000833000 seconds user
0.000250000 seconds sys
{'t_sec': 1707241305.4141793, 'cache-references': 27322, 'cache-misses': 6778, 'cycles': 473798, 'instructions': 600974, 'branches': 140006, 'faults': 64, 'migrations': 0} |
Results
Code Block | ||||
---|---|---|---|---|
| ||||
ana-4.0.59-py3 [dubrovin@sdfmilan216:~/LCLS/con-py3]$
1 CPU ======
Performance counter stats for 'python test-scaling-subproc.py -1':
4,522,410,200 cache-references:u (62.49%)
112,207,635 cache-misses:u # 2.481 % of all cache refs (62.51%)
224,402,878,245 cycles:u (62.51%)
428,582,543,872 instructions:u # 1.91 insn per cycle (62.51%)
59,430,436,824 branches:u (62.50%)
2,353,206,592 branch-misses:u # 3.96% of all branches (62.50%)
657,277 faults:u
0 migrations:u
657,277 page-faults:u
2,169,783,808 L1-dcache-load-misses:u (62.50%)
7,173,374 L1-icache-load-misses:u (62.50%)
70.762930452 seconds time elapsed
66.918003000 seconds user
2.380196000 seconds sys

8 CPU ======
Performance counter stats for 'python test-scaling-subproc.py -2':
35,293,654,947 cache-references:u (62.50%)
675,772,563 cache-misses:u # 1.915 % of all cache refs (62.50%)
1,863,835,416,629 cycles:u (62.50%)
3,408,694,078,315 instructions:u # 1.83 insn per cycle (62.50%)
470,729,321,611 branches:u (62.50%)
18,710,029,709 branch-misses:u # 3.97% of all branches (62.50%)
4,759,204 faults:u
0 migrations:u
4,759,204 page-faults:u
17,164,781,068 L1-dcache-load-misses:u (62.50%)
42,407,266 L1-icache-load-misses:u (62.50%)
82.107165073 seconds time elapsed
600.726489000 seconds user
28.169314000 seconds sys

16 CPU ======
Performance counter stats for 'python test-scaling-subproc.py -8':
71,125,012,043 cache-references:u (62.50%)
2,509,743,885 cache-misses:u # 3.529 % of all cache refs (62.50%)
4,256,512,072,612 cycles:u (62.50%)
6,815,210,853,848 instructions:u # 1.60 insn per cycle (62.50%)
940,797,592,651 branches:u (62.50%)
37,401,077,277 branch-misses:u # 3.98% of all branches (62.50%)
9,874,603 faults:u
0 migrations:u
9,874,603 page-faults:u
34,764,585,133 L1-dcache-load-misses:u (62.50%)
82,908,203 L1-icache-load-misses:u (62.50%)
98.180409648 seconds time elapsed
1370.175346000 seconds user
121.864448000 seconds sys

32 CPU ======
Performance counter stats for 'python test-scaling-subproc.py -13':
140,229,421,945 cache-references:u (62.50%)
5,022,345,750 cache-misses:u # 3.582 % of all cache refs (62.50%)
8,558,410,936,114 cycles:u (62.50%)
13,628,360,184,584 instructions:u # 1.59 insn per cycle (62.50%)
1,881,291,550,548 branches:u (62.50%)
74,783,808,615 branch-misses:u # 3.98% of all branches (62.50%)
19,579,143 faults:u
0 migrations:u
19,579,143 page-faults:u
68,615,480,748 L1-dcache-load-misses:u (62.50%)
163,094,161 L1-icache-load-misses:u (62.50%)
99.279801084 seconds time elapsed
2763.979749000 seconds user
246.852789000 seconds sys

56 CPU ======
Performance counter stats for 'python test-scaling-subproc.py -16':
245,664,589,385 cache-references:u (62.50%)
5,986,128,102 cache-misses:u # 2.437 % of all cache refs (62.50%)
13,462,198,820,573 cycles:u (62.50%)
23,847,765,747,744 instructions:u # 1.77 insn per cycle (62.50%)
3,290,927,488,525 branches:u (62.50%)
130,897,170,304 branch-misses:u # 3.98% of all branches (62.50%)
35,494,247 faults:u
0 migrations:u
35,494,247 page-faults:u
119,933,873,577 L1-dcache-load-misses:u (62.50%)
288,403,921 L1-icache-load-misses:u (62.50%)
108.453630713 seconds time elapsed
5381.177612000 seconds user
333.903330000 seconds sys

64 CPU ======
Performance counter stats for 'python test-scaling-subproc.py -17':
281,639,175,978 cache-references:u (62.50%)
8,968,404,974 cache-misses:u # 3.184 % of all cache refs (62.50%)
16,140,364,752,053 cycles:u (62.50%)
27,256,133,511,829 instructions:u # 1.69 insn per cycle (62.50%)
3,761,710,111,186 branches:u (62.50%)
149,569,155,086 branch-misses:u # 3.98% of all branches (62.50%)
39,148,442 faults:u
0 migrations:u
39,148,442 page-faults:u
137,584,278,754 L1-dcache-load-misses:u (62.50%)
330,750,296 L1-icache-load-misses:u (62.50%)
120.688547006 seconds time elapsed
6274.688233000 seconds user
484.406164000 seconds sys

120 CPU =======
Performance counter stats for 'python test-scaling-subproc.py -18':
532,229,037,371 cache-references:u (62.50%)
14,227,944,434 cache-misses:u # 2.673 % of all cache refs (62.50%)
29,404,359,241,173 cycles:u (62.50%)
51,095,884,028,391 instructions:u # 1.74 insn per cycle (62.50%)
7,053,547,766,317 branches:u (62.50%)
280,479,284,507 branch-misses:u # 3.98% of all branches (62.50%)
73,250,012 faults:u
0 migrations:u
73,250,012 page-faults:u
260,078,672,869 L1-dcache-load-misses:u (62.50%)
618,858,635 L1-icache-load-misses:u (62.50%)
119.736692035 seconds time elapsed
11628.275939000 seconds user
843.423292000 seconds sys |
Code Block | ||||
---|---|---|---|---|
| ||||
host: sdfmilan216 version: v32_1
test# t_sec cache-references cache-misses cycles instructions branches faults migrations
000 0 27981 7136 656920 602175 140104 65 0
001 5 27480 6806 575033 600309 140069 65 0
002 10 27791 7056 590487 601053 140020 66 0
003 15 27533 6992 627866 601394 140128 66 0
004 20 29888 7682 654153 602644 140029 66 0
005 25 28022 6871 656841 601179 140050 63 0
006 30 28541 7349 601647 600316 140059 65 0
007 35 28091 7098 565066 600140 140009 64 0
008 40 27702 6705 698524 601910 140023 63 0
009 45 28209 6863 809092 600409 140083 65 0
010 50 28196 6726 526517 600063 139994 63 0
011 55 27951 6896 600846 600959 140001 63 0
012 60 27028 6796 595225 600121 140003 65 0
013 65 27907 6964 585121 600061 139995 65 0
014 70 27852 7030 657846 600064 139995 64 0
015 75 28096 6913 774550 602619 140018 63 0
016 80 27460 6835 582093 600069 139997 64 0
017 85 27610 6952 652613 602741 140051 65 0
018 90 27861 7139 584477 601047 140018 67 0
019 95 28019 7105 664649 600288 140052 68 0
020 100 27047 6732 615723 600422 140003 66 0
021 106 27820 7055 622018 600129 140005 63 0
022 111 27162 6811 540837 600987 140012 67 0
023 116 28197 7050 639233 602179 140108 69 0
024 121 28441 7268 717646 600324 140074 65 0
025 126 28351 7310 642671 603474 140026 67 0
026 131 27470 6933 676669 602621 140020 65 0
027 136 27506 6944 628244 601155 140043 63 0
028 141 27417 6906 618067 601887 140017 65 0
029 146 27538 6807 632330 602653 140031 67 0
030 151 27489 6964 645969 602620 140019 64 0
031 156 26782 6631 456450 600397 139995 65 0
032 161 27715 7114 478412 601065 140022 65 0
033 166 27648 6862 666839 600403 139997 65 0
034 171 27790 7116 607178 600961 140003 64 0
035 176 28658 7119 853651 600241 140039 65 0
036 181 28039 7076 631105 600126 140005 66 0
037 186 27223 6896 609775 603480 140025 64 0
038 191 28072 6938 474044 602622 140021 65 0
039 196 27164 6794 586250 600412 140000 66 0
040 201 27462 6949 608541 601192 140070 66 0
041 207 27634 6773 563547 600079 140000 65 0
042 212 27950 6816 804718 603460 140019 64 0
043 217 27703 6876 692982 601054 140018 64 0
044 222 27273 6816 545147 601051 140018 65 0
045 227 27760 6789 722358 600128 140004 64 0
046 232 27992 6888 678596 601050 140017 63 0
047 237 27817 7228 599381 601153 140058 65 0
048 242 27968 7140 696749 600546 140037 68 0
049 247 28587 7025 695330 600093 140004 65 0
050 252 27502 6858 516984 603752 140108 66 0
051 257 27653 7056 592800 603946 140169 69 0
052 262 27727 6860 717016 601879 140013 63 0
053 267 27980 6980 632274 600205 140025 65 0
054 272 28259 7390 599365 601891 140018 65 0
055 277 27593 7039 695692 601887 140017 65 0
056 282 27422 6773 482596 603483 140028 67 0
057 287 30600 6860 607175 602626 140022 64 0
058 292 26858 6829 473289 600143 140009 65 0
059 297 27374 7000 474284 601880 140014 64 0
060 302 27349 6724 511743 601179 140050 63 0
061 307 27567 6694 509492 602773 140060 65 0
062 313 27568 6969 500050 600102 140009 68 0
063 318 27293 6847 507337 600531 140029 63 0
064 323 27894 6883 505431 600522 140027 64 0
065 328 27367 6821 495425 602764 140057 64 0
066 333 28630 7033 611191 601177 140065 65 0
067 338 27667 6999 573806 600059 139993 63 0
068 343 27957 7049 531007 603732 140102 65 0 |
Summary
number of CPU | cache-references | cache-misses | cycles | instructions | branches | branch-misses | faults | page-faults | L1-dcache-load-misses | L1-icache-load-misses | L1-icache Ratio N/1 | cmt |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 4,522,410,200 | 112,207,635 | 224,402,878,245 | 428,582,543,872 | 59,430,436,824 | 2,353,206,592 | 657,277 | 657,277 | 2,169,783,808 | 7,173,374 | 1 | |
8 | 35,293,654,947 | 675,772,563 | | | | 18,710,029,709 | | | 17,164,781,068 | 42,407,266 | 5.9 | |
16 | 71,125,012,043 | 2,509,743,885 | | | | 37,401,077,277 | | | 34,764,585,133 | 82,908,203 | 11.6 | |
32 | 140,229,421,945 | 5,022,345,750 | | | | 74,783,808,615 | | | 68,615,480,748 | 163,094,161 | 22.7 | |
56 | 245,664,589,385 | 5,986,128,102 | | | | 130,897,170,304 | | | 119,933,873,577 | 288,403,921 | 40.2 | |
64 | 281,639,175,978 | 8,968,404,974 | | | | 149,569,155,086 | | | 137,584,278,754 | 330,750,296 | 46.1 | |
120 | 532,229,037,371 | 14,227,944,434 | 29,404,359,241,173 | 51,095,884,028,391 | 7,053,547,766,317 | 280,479,284,507 | 73,250,012 | 73,250,012 | 260,078,672,869 | 618,858,635 | 86.3 | |
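The "L1-icache Ratio N/1" column is the L1-icache-load-misses count for N CPUs divided by the count for 1 CPU. A minimal sketch of that arithmetic, using the counts from the table above (the function name `ratio_to_single` is hypothetical):

```python
# L1-icache-load-misses per number of CPUs, copied from the summary table.
l1_icache_misses = {
    1: 7_173_374,
    8: 42_407_266,
    16: 82_908_203,
    32: 163_094_161,
    56: 288_403_921,
    64: 330_750_296,
    120: 618_858_635,
}

def ratio_to_single(counts):
    """Return {ncpu: count(ncpu) / count(1)} for a dict keyed by number of CPUs."""
    base = counts[1]
    return {ncpu: count / base for ncpu, count in counts.items()}

ratios = ratio_to_single(l1_icache_misses)
for ncpu, ratio in ratios.items():
    print(f"{ncpu:4d} CPU  ratio N/1 = {ratio:5.1f}")
```

The same function can be applied to any other counter column for which all rows are filled in.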
2024-02-07 Test of milano216 host with command perf
Description
Running perf with mpirun on 1 CPU and on 80 CPUs:
perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses,dTLB-load-misses,iTLB-load-misses mpirun -n 1 python Detector/examples/test-scaling-mpi.py
perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses,dTLB-load-misses,iTLB-load-misses mpirun -n 80 python Detector/examples/test-scaling-mpi.py
Code Block | ||||
---|---|---|---|---|
| ||||
import numpy as np
from time import time

def random_standard(shape=(40,60), mu=200, sigma=25, dtype=np.float64):
    a = mu + sigma*np.random.standard_normal(shape)
    return np.require(a, dtype)

def random_arrays(sh2d = (8*512,1024), dtype=np.float64):
    sh3d = (3,) + sh2d
    return random_standard(shape=sh2d, mu=10, sigma=2, dtype=dtype),\
           random_standard(shape=sh3d, mu=20, sigma=3, dtype=dtype)

def time_consuming_algorithm():
    a, b = random_arrays()
    gr1 = a>=11
    gr2 = (a>9) & (a<11)
    gr3 = a<=9
    t0_sec = time()
    a[gr1] -= b[0, gr1]
    a[gr2] -= b[1, gr2]
    a[gr3] -= b[2, gr3]
    return time() - t0_sec |
Code Block | ||||
---|---|---|---|---|
| ||||
import psutil
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
hostname = get_hostname()  # get_hostname() is not shown on this page
cpu_num = psutil.Process().cpu_num()
print('rank:%02d cpu_num:%03d size:%02d' % (rank, cpu_num, size))
ranks = (0, 10, 20, 30, 40, 50, 60, 70)
SAVE_FIGS = True
SHOW_FIGS = False
nevents = 100
arrts = np.zeros((nevents, size), dtype=np.float64)
for nevt in range(nevents):
    dt_sec = time_consuming_algorithm()
    arrts[nevt,rank] = dt_sec # dt_sec = time()-t0_sec
    cpu_num = psutil.Process().cpu_num()
    if cpu_num >=16 and cpu_num <=23:
        print('rank:%02d cpu_num:%03d nevt:%03d time:%.6f CPU_NUM IN WEKA RANGE [16,23]' % (rank, cpu_num, nevt, dt_sec))
    if nevt%10>0: continue
    print('rank:%02d cpu_num:%03d nevt:%03d time:%.6f' % (rank, cpu_num, nevt, dt_sec))
...
some graphics for array arrts |
Results
Code Block | ||||
---|---|---|---|---|
| ||||
ana-4.0.59-py3 [dubrovin@sdfmilan216:~/LCLS/con-py3]$ perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses,dTLB-load-misses,iTLB-load-misses mpirun -n 1 python Detector/examples/test-scaling-mpi.py
...
Performance counter stats for 'mpirun -n 1 python Detector/examples/test-scaling-mpi.py':
4,448,830,552 cache-references:u (50.00%)
90,374,312 cache-misses:u # 2.031 % of all cache refs (50.00%)
222,814,516,280 cycles:u (50.02%)
426,700,282,993 instructions:u # 1.92 insn per cycle (50.01%)
58,876,394,584 branches:u (50.01%)
2,343,687,188 branch-misses:u # 3.98% of all branches (50.01%)
635,183 faults:u
0 migrations:u
635,183 page-faults:u
2,158,358,417 L1-dcache-load-misses:u (50.00%)
5,694,036 L1-icache-load-misses:u (49.99%)
4,282,821 dTLB-load-misses:u (49.99%)
890,671 iTLB-load-misses:u (50.00%)
73.297275789 seconds time elapsed
69.795728000 seconds user
2.318007000 seconds sys

ana-4.0.59-py3 [dubrovin@sdfmilan216:~/LCLS/con-py3]$ perf stat -e cache-references,cache-misses,cycles,instructions,branches,branch-misses,faults,migrations,page-faults,L1-dcache-load-misses,L1-icache-load-misses,dTLB-load-misses,iTLB-load-misses mpirun -n 80 python Detector/examples/test-scaling-mpi.py
...
Performance counter stats for 'mpirun -n 80 python Detector/examples/test-scaling-mpi.py':
349,526,509,383 cache-references:u (50.01%)
5,932,480,814 cache-misses:u # 1.697 % of all cache refs (50.00%)
18,768,444,974,036 cycles:u (50.00%)
33,983,153,714,284 instructions:u # 1.81 insn per cycle (49.99%)
4,684,730,635,234 branches:u (49.99%)
186,649,297,019 branch-misses:u # 3.98% of all branches (50.00%)
52,121,421 faults:u
0 migrations:u
52,121,421 page-faults:u
171,500,392,922 L1-dcache-load-misses:u (50.00%)
267,672,856 L1-icache-load-misses:u (50.00%)
339,145,247 dTLB-load-misses:u (50.01%)
69,780,394 iTLB-load-misses:u (50.01%)
92.952500273 seconds time elapsed
6501.353593000 seconds user
410.844719000 seconds sys |
Code Block | ||||
---|---|---|---|---|
| ||||
host: sdfmilan216 version: v32_1 (test# 069-099)
test# t_sec cache-references cache-misses cycles instructions branches faults migrations
069 348 27663 6984 497372 602742 140052 66 0
070 353 27757 7122 467605 603782 140117 66 0
071 358 27562 6901 494848 600064 139995 64 0
072 363 27958 6891 494870 601182 140053 65 0
073 368 27386 6802 486891 600959 140001 63 0
074 373 27759 6793 489091 601054 140018 63 0
075 378 27784 6962 513506 603481 140026 65 0
076 383 27924 7059 604907 600396 139994 64 0
077 388 27703 7133 640672 601881 140015 65 0
078 393 28245 6975 735747 603941 140164 64 0
079 398 28220 7206 646143 603482 140027 66 0
080 403 27845 7027 523197 601913 140020 64 0
081 408 27938 6996 541366 600261 140057 65 0
082 413 27288 6721 498737 602620 140019 64 0
083 419 28326 7148 553793 603764 140113 68 0
084 424 27817 6753 600135 601190 140055 66 0
085 429 27297 6725 620434 600960 140002 64 0
086 434 28423 6916 619236 602629 140022 64 0
087 439 27316 6827 558557 600240 140038 64 0
088 444 28744 7059 822160 600188 140021 66 0
089 449 27504 6877 524623 600069 139997 65 0
090 454 27836 6895 607227 601902 140022 65 0
091 459 27343 6923 628340 602317 140142 65 0
092 464 29224 7100 728467 602632 140025 68 0
093 469 27495 6731 495067 603461 140020 65 0
094 474 28429 7192 642561 602654 140032 67 0
095 479 27707 6896 592434 600319 140062 69 0
096 484 27315 6806 698661 600511 140026 67 0
097 489 28164 6934 611405 601042 140016 65 0
098 494 28558 7297 647084 601042 140016 66 0
099 499 27607 6832 640796 600398 139996 66 0 |
Code Block | ||||
---|---|---|---|---|
| ||||
host: sdfmilan216 version: v64
test# t_sec cache-references cache-misses cycles instructions branches faults migrations
000 0 27807 6624 729494 603490 140028 64 0
001 5 28247 7140 594354 603471 140023 64 0
002 10 27530 6794 526625 600130 140006 66 0
003 15 28074 7134 584615 603463 140022 67 0
004 20 27226 6765 568909 601163 140061 65 0
005 25 27563 6878 608405 603761 140110 65 0
006 30 27623 6969 585092 602011 140048 63 0
007 35 28016 7208 584639 600074 139998 64 0
008 40 27379 6883 557356 600535 140032 65 0
009 45 28089 7155 679594 600533 140030 63 0
010 50 27421 6909 546743 603470 140022 64 0
011 55 27773 6839 672210 601042 140016 66 0
012 60 27316 6914 983778 603461 140020 64 0
013 65 29078 7103 1105556 600407 139998 65 0
014 70 27415 6885 1289278 601889 140019 67 0
015 75 27235 6850 596144 602625 140021 65 0
016 80 27224 6844 595918 600335 140078 67 0
017 85 27391 6826 545500 601040 140014 64 0
018 90 27753 6913 555237 600400 139995 64 0
019 96 27512 6800 546790 600342 140079 66 0
020 101 27818 6651 588694 602190 140109 66 0
021 106 27745 6969 603165 603460 140019 63 0
022 111 27706 6790 553593 603471 140023 65 0
023 116 27221 6801 632050 600059 139993 63 0
024 121 27708 6824 545754 602640 140025 64 0
025 126 27290 6783 533678 600069 139997 65 0
026 131 27309 6880 554409 600977 140009 67 0
027 136 28198 6748 675396 602626 140022 66 0
028 141 27231 6980 1156713 602754 140054 63 0
029 146 27808 6755 668027 601393 140127 64 0
030 151 27487 6743 618176 600314 140057 64 0
031 156 27511 6734 610589 600360 140074 64 0
032 161 27664 6937 519508 600397 139995 65 0
033 166 27473 6802 583479 601179 140067 67 0
034 171 27273 6761 592098 600124 140003 64 0
035 176 27060 6672 533480 600397 139995 65 0
036 181 27516 6836 836405 600068 139996 64 0
037 186 26990 6592 519081 600135 140007 65 0
038 192 27087 6599 578868 600395 139993 63 0
039 197 27208 6650 502553 601884 140018 67 0
040 202 27340 6935 513824 600418 140002 66 0
041 207 27134 6706 535601 600508 140023 64 0
042 212 27640 6902 623607 600078 139999 64 0
043 217 27683 7008 696304 602742 140052 65 0
044 222 27462 6751 502284 602629 140022 65 0
045 227 27673 6964 693908 600078 139999 64 0
046 232 27886 7068 663126 600764 140102 64 0
047 237 27642 7073 601080 601041 140015 65 0
048 242 27093 6793 491163 600176 140017 65 0
049 247 27628 6886 557406 603461 140020 65 0
050 252 27014 6783 521803 600121 140003 65 0
051 257 27689 6783 571768 603483 140028 67 0
052 262 27850 6952 668413 603464 140023 67 0
053 267 27859 6789 628276 601903 140023 67 0
054 272 27749 6733 584079 603621 140068 65 0
055 277 28189 7124 569216 601049 140017 65 0
056 282 27601 6917 574187 603782 140117 66 0
057 287 27928 6848 536114 600985 140010 65 0
058 293 27228 6742 597382 600396 139994 64 0
059 298 27858 6830 626275 600418 140002 66 0
060 303 27106 6749 592436 600225 140030 66 0
061 308 27146 6908 1148965 601180 140068 67 0
062 313 26963 6688 578839 600975 140007 65 0
063 318 29770 6771 574128 603462 140021 65 0
064 323 27643 6972 539701 600999 140011 66 0
065 328 28824 6874 655647 601880 140014 64 0
066 333 27006 6751 574385 600121 140003 65 0
067 338 27562 6912 1133050 602740 140050 64 0
068 343 27802 7111 672730 602625 140021 65 0
069 348 28163 7089 1065588 601900 140023 66 0
070 353 27878 6874 695158 600071 139999 67 0
071 358 27010 6839 538832 603493 140031 67 0
072 363 27371 6882 536044 602622 140021 66 0
073 368 27421 6742 519800 600970 140005 64 0
074 373 26952 6766 943875 601155 140043 63 0
075 378 27974 6979 978668 600529 140030 65 0
076 383 27233 6846 969595 600305 140068 65 0
077 388 27435 6864 1002256 601190 140055 65 0
078 394 26603 6687 512388 600544 140035 66 0
079 399 27703 7229 826644 600334 140064 66 0
080 404 30357 7076 697640 600456 140012 68 0
081 409 27593 7014 542822 600409 140000 67 0
082 414 28354 7250 732640 601181 140069 68 0
083 419 27387 7071 563924 601176 140064 64 0
084 424 27977 7275 629946 601886 140016 64 0
085 429 27340 6741 575039 602920 140103 64 0
086 434 27795 7084 619973 602620 140019 64 0
087 439 27463 6668 557690 600975 140007 65 0
088 444 28083 7166 663865 600961 140003 65 0
089 449 27561 6870 589958 600077 139998 63 0
090 454 27670 7053 616345 601883 140017 67 0
091 459 27216 6903 564047 600962 140004 65 0
092 464 26729 6814 1162124 600065 139996 65 0
093 469 27246 6698 467172 600068 139996 64 0
094 474 27543 6816 487590 600127 140006 67 0
095 479 27374 6754 526273 600068 139996 64 0
096 485 28285 7204 661196 600094 140005 66 0
097 490 27550 6901 578701 602648 140030 67 0
098 495 28363 7337 649845 600533 140031 64 0
099 500 27322 6778 473798 600974 140006 64 0 |
Code Block | ||||
---|---|---|---|---|
| ||||
host: sdfmilan216 version: v120
test# t_sec cache-references cache-misses cycles instructions branches faults migrations
000 0 27931 6807 740397 603462 140021 65 0
001 5 29421 7303 889000 603481 140026 65 0
002 10 28498 7097 754015 601047 140018 66 0
003 15 27902 6810 624262 602619 140018 63 0
004 20 27691 7099 643281 603499 140028 67 0
005 25 26869 6721 613941 600397 139995 65 0
006 30 28939 6813 607238 601888 140018 66 0
007 35 27753 6898 601369 600302 140068 66 0
008 40 28215 6828 560459 600186 140019 64 0
009 45 29477 6852 663561 600537 140034 65 0
010 50 27399 6776 565464 600208 140028 68 0
011 55 27550 6890 638548 600971 140006 65 0
012 60 27572 6834 572234 600249 140041 65 0
013 65 28684 6880 633639 601042 140016 65 0
014 70 27078 6702 577830 602621 140020 65 0
015 75 27944 6998 629551 601880 140014 64 0
016 80 27577 7039 589876 601055 140019 64 0
017 85 28001 6929 603365 600985 140010 65 0
018 90 27183 6926 621575 602631 140024 66 0
019 95 28359 6770 590508 600395 139993 63 0
020 100 27652 6960 595630 603471 140023 65 0
021 106 30539 6764 1159692 600961 140003 65 0
022 112 27242 7972 1147605 600962 140004 66 0
023 118 27991 7139 1206382 601045 140016 65 0
024 123 26929 7209 1520627 603480 140025 64 0
025 129 27643 7187 1183727 600399 139997 66 0
026 135 26617 6810 1351490 600961 140003 64 0
027 141 26366 8105 1419415 600404 139998 66 0
028 146 23734 8431 1063626 601897 140020 65 0
029 152 24920 8596 1202865 602163 140100 65 0
030 158 28609 8131 1380905 601883 140017 66 0
031 163 25033 8436 1031466 600062 139996 66 0
032 169 26950 6785 1238519 603482 140027 66 0
033 175 27594 6894 1036822 601891 140018 65 0
034 181 31002 6904 1043149 600961 140003 64 0
035 186 31024 6777 1041242 600065 139996 65 0
036 192 26870 6779 1134167 600075 139999 65 0
037 197 28381 7076 917260 600619 140064 64 0
038 202 27976 6872 599000 603471 140023 65 0
039 207 28855 7121 702819 602764 140057 64 0
040 212 27711 6912 520624 600323 140059 63 0
041 217 27061 6981 541981 600143 140009 65 0
042 222 27226 6645 575722 602641 140026 65 0
043 227 27115 6713 525871 600127 140006 66 0
044 232 27806 6832 1065500 600127 140003 63 0
045 237 27357 6856 1065612 600060 139994 64 0
046 242 26999 6811 1046968 600975 140007 65 0
047 247 27767 7046 545133 600397 139995 65 0
048 253 27546 6946 839604 600063 139997 66 0
049 258 27782 6750 866929 602632 140025 68 0
050 263 27729 7086 726946 600410 140001 68 0
051 268 27340 6826 976370 602623 140022 66 0
052 273 27450 6646 872728 600120 140002 63 0
053 278 27269 6861 860563 601043 140017 66 0
054 283 27358 6871 1017536 600314 140057 63 0
055 288 27165 6832 805321 601880 140014 63 0
056 293 27455 6875 926200 602652 140030 65 0
057 298 27200 6704 909596 602317 140142 65 0
058 303 27761 6980 859384 602775 140066 65 0
059 308 27801 7020 682316 600065 139996 65 0
060 313 27593 6991 902030 600413 140001 67 0
061 318 26614 6645 792663 600187 140020 65 0
062 323 27427 6647 1081342 602633 140026 68 0
063 328 27047 6730 868243 600063 139997 67 0
064 333 27317 6857 894792 601157 140045 65 0
065 338 28685 6993 608994 600524 140029 64 0
066 344 24688 8446 1258195 603782 140117 65 0
067 350 25507 7527 1314785 602741 140051 64 0
068 355 26946 6880 1288179 603461 140020 65 0
069 361 30221 6737 834593 602622 140021 65 0
070 367 24845 7983 1305169 600975 140007 64 0
071 372 25235 7542 1210487 602641 140026 65 0
072 378 27787 7044 1113910 600413 140001 67 0
073 384 25338 8136 1221054 601050 140018 65 0
074 390 24565 8029 1299879 600327 140063 66 0
075 395 27326 6989 1123870 600334 140064 66 0
076 401 25591 9659 1485770 602636 140025 65 0
077 407 25937 9379 1242144 602766 140059 65 0
078 412 25349 8464 1118847 601049 140016 63 0
079 418 26772 7380 1246516 600764 140102 63 0
080 424 27522 6988 1671984 600214 140027 64 0
081 430 27465 8689 1154090 601044 140015 64 0
082 435 30807 6815 1084313 601900 140020 64 0
083 440 28580 7260 597202 600399 139997 67 0
084 445 28788 7039 767798 601898 140021 66 0
085 450 28948 7032 733130 603463 140022 65 0
086 455 28059 7309 536819 602625 140021 65 0
087 460 27062 6525 630170 601053 140017 63 0
088 466 27254 6773 549007 602622 140021 66 0
089 471 26982 6652 589272 601042 140016 66 0
090 476 26762 6858 1026028 600418 140002 66 0
091 481 27963 7002 658324 601169 140050 66 0
092 486 27989 7013 605852 601891 140018 65 0
093 491 28111 7131 611524 603493 140031 67 0
094 496 27635 7109 598087 602631 140024 67 0
095 501 27312 6889 596459 600071 139999 67 0
096 506 28674 7153 589621 600125 140004 65 0
097 511 27867 7041 603441 601893 140020 67 0
098 516 27463 6819 493625 600067 139998 66 0
099 521 28909 7355 672193 603462 140021 65 0 |
Summary
number of mpi cores | cache- references | cache- misses | cycles | instructions | branches | branch- misses | faults | page-faults | L1-dcache- load-misses | L1-icache- load-misses | dTLB- load-misses | iTLB- load-misses | cmt |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 4,448,830,552 | 90,374,312 | 222,814,516,280 | 426,700,282,993 | 58,876,394,584 | 2,343,687,188 | 635,183 | 635,183 | 2,158,358,417 | 5,694,036 | 4,282,821 | 890,671 | |
80 | 349,526,509,383 | 5,932,480,814 | 18,768,444,974,036 | 33,983,153,714,284 | 4,684,730,635,234 | 186,649,297,019 | 52,121,421 | 52,121,421 | 171,500,392,922 | 267,672,856 | 339,145,247 | 69,780,394 | |
Ratio (80)/(1) | 78.6 | 65.6 | 84.2 | 79.6 | 79.6 | 79.6 | 82.1 | 82.1 | 79.5 | 47.0 | 79.2 | 78.3 | |
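Besides the raw ratios, the counters in the table can be combined into derived rates, which perf itself prints as comments (insn per cycle, % of all cache refs, % of all branches). The sketch below recomputes them from the table values for 1 and 80 MPI cores; the dictionary layout and the function name `derived_rates` are illustrative assumptions, not part of the test script.

```python
# Counter values copied from the summary table for mpirun -n 1 and -n 80.
counters = {
    1: {
        "cache-references": 4_448_830_552,
        "cache-misses": 90_374_312,
        "cycles": 222_814_516_280,
        "instructions": 426_700_282_993,
        "branches": 58_876_394_584,
        "branch-misses": 2_343_687_188,
    },
    80: {
        "cache-references": 349_526_509_383,
        "cache-misses": 5_932_480_814,
        "cycles": 18_768_444_974_036,
        "instructions": 33_983_153_714_284,
        "branches": 4_684_730_635_234,
        "branch-misses": 186_649_297_019,
    },
}

def derived_rates(c):
    """Return instructions per cycle and miss rates (in percent) for one run."""
    return {
        "insn per cycle": c["instructions"] / c["cycles"],
        "cache miss %": 100.0 * c["cache-misses"] / c["cache-references"],
        "branch miss %": 100.0 * c["branch-misses"] / c["branches"],
    }

rates = {n: derived_rates(c) for n, c in counters.items()}
for n, r in rates.items():
    print(n, {k: round(v, 3) for k, v in r.items()})
```

Comparing these rates between the two runs separates "more work" (counts growing roughly with the number of cores) from "less efficient work" (a change in the rate itself).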
2024-02-09 Test of milano216 host with command perf
Description
Use commands with a modified list of counters, for example:
perf stat -e stalled-cycles-backend,stalled-cycles-frontend,ls_l1_d_tlb_miss.all,l1_dtlb_misses,l1_data_cache_fills_all,bp_l1_tlb_miss_l2_tlb_miss.if2m,bp_l1_tlb_miss_l2_tlb_miss,l2_dtlb_misses,l2_itlb_misses python test-scaling-subproc.py -8
The perf output is converted to a dict and the results are presented in a table.
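The conversion of perf output to a dict can be sketched as follows. This is not the code used for the tests; it is a minimal illustration that assumes each counter line starts with a comma-grouped integer followed by the event name and an optional modifier such as `:u`. The counter values in the sample are taken from the 'sleep 5' output on this page; the whitespace layout of the sample and the function name `perf_to_dict` are assumptions.

```python
import re

SAMPLE = """\
 Performance counter stats for 'sleep 5':

            27,322      cache-references:u
             6,778      cache-misses:u            #   24.808 % of all cache refs
           473,798      cycles:u
           600,974      instructions:u            #    1.27  insn per cycle
           140,006      branches:u
                64      faults:u
                 0      migrations:u

       5.003108172 seconds time elapsed
"""

# A count with optional thousands separators, then the event name,
# optionally followed by a modifier like ":u".
COUNTER_LINE = re.compile(r"^\s*([\d,]+)\s+([A-Za-z][\w.\-]*?)(?::[A-Za-z]+)?(?:\s|$)")

def perf_to_dict(text):
    """Return {event name: integer count} for every counter line in text."""
    result = {}
    for line in text.splitlines():
        m = COUNTER_LINE.match(line)
        if m:
            result[m.group(2)] = int(m.group(1).replace(",", ""))
    return result

stats = perf_to_dict(SAMPLE)
print(stats)
```

Lines such as "5.003108172 seconds time elapsed" are skipped because the leading number is not followed by whitespace and an event name in the expected form.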
Summary
(*) CPU numbers marked with an asterisk exclude the CPUs used by the weka FS.
number of CPU | stalled-cycles-backend | ←Ratio N/1 | stalled-cycles-frontend | ←Ratio N/1 | ls_l1_d_tlb_miss.all | ←Ratio N/1 | l1_dtlb_misses | ←Ratio N/1 | l1_data_cache_fills_all | ←Ratio N/1 | bp_l1_tlb_miss_l2_tlb_miss.if2m | ←Ratio N/1 | bp_l1_tlb_miss_l2_tlb_miss | ←Ratio N/1 | l2_dtlb_misses | ←Ratio N/1 | l2_itlb_misses | ←Ratio N/1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 143,828,614 | 1 | 230,987,724 | 1 | 33,227,437 | 1 | 32,845,193 | 1 | 2,179,469,714 | 1 | 3,701 | 1 | 769,309 | 1 | 4,833,384 | 1 | 719,026 | 1 |
8 | 2,105,881,833 | 15 | 3,421,108,359 | 15 | 172,779,030 | 5.2 | 173,508,212 | 5.3 | 18,216,564,874 | 8.3 | 25,606 | 6.9 | 6,124,897 | 7.9 | 31,719,300 | 6.5 | 5,591,821 | 7.8 |
16 | 8,796,313,234 | 61 | 8,018,691,890 | 35 | 327,892,753 | 9.9 | 326,337,183 | 9.9 | 34,551,341,060 | 15.8 | 55,331 | 14.8 | 12,467,976 | 16 | 68,227,221 | 14 | 10,605,352 | 14.7 |
24* | 10,413,149,941 | 72 | 10,519,490,870 | 46 | 491,673,248 | 14.8 | 490,566,093 | 14.9 | 51,539,384,297 | 23.6 | 78,433 | 21 | 17,889,621 | 23 | 96,922,469 | 20 | 15,177,116 | 21.1 |
32 | 17,251,055,297 | 120 | 13,858,554,955 | 60 | 671,047,247 | 20.2 | 666,230,997 | 20.3 | 68,736,842,168 | 31.5 | 105,874 | 29 | 23,936,978 | 31 | 135,322,250 | 28 | 21,599,940 | 30.0 |
56* | 17,892,504,080 | 124 | 24,120,493,158 | 104 | 1,136,778,538 | 34.2 | 1,135,448,325 | 34.6 | 120,696,775,952 | 55.3 | 178,082 | 48 | 42,679,843 | 55 | 234,498,254 | 48 | 38,164,171 | 53 |
64 | 27,304,844,238 | 190 | 27,697,522,017 | 120 | 1,258,999,729 | 37.9 | 1,258,031,354 | 38.3 | 141,469,109,046 | 64.9 | 201,330 | 54 | 50,957,218 | 66 | 258,609,632 | 53 | 43,825,042 | 60.9 |
120* | 45,388,735,746 | 316 | 46,279,264,661 | 200 | 2,382,065,820 | 71.6 | 2,376,507,106 | 72.4 | 264,016,453,328 | 121 | 375,699 | 102 | 93,410,817 | 121 | 488,308,155 | 101 | 78,261,952 | 109 |
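One way to read the "Ratio N/1" columns is to divide each ratio by the number of CPUs: a value near 1 would correspond to a counter that grows in proportion to the number of CPUs, and a value above 1 to faster-than-linear growth. The sketch below does this for stalled-cycles-backend with the counts from the table; the function name `ratio_per_cpu` is hypothetical and the interpretation is only a suggestion.

```python
# stalled-cycles-backend per number of CPUs, copied from the table above.
stalled_backend = {
    1: 143_828_614,
    8: 2_105_881_833,
    16: 8_796_313_234,
    24: 10_413_149_941,
    32: 17_251_055_297,
    56: 17_892_504_080,
    64: 27_304_844_238,
    120: 45_388_735_746,
}

def ratio_per_cpu(counts):
    """Return {ncpu: (count(ncpu) / count(1)) / ncpu}."""
    base = counts[1]
    return {ncpu: count / base / ncpu for ncpu, count in counts.items()}

per_cpu = ratio_per_cpu(stalled_backend)
for ncpu, value in per_cpu.items():
    print(f"{ncpu:4d} CPU  ratio N/1 per CPU = {value:.2f}")
```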