We compare how different builds of Open MPI perform on drp-srcf and sdf nodes.
sdf:
1) openmpi 4.0.4 ('--prefix=/opt/openmpi-4.0.4' '--with-ucx' '--enable-mpi-cxx')
drp-srcf:
1) openmpi 4.1.1 (from ana-4.0.23-py3 with ucx)
2) openmpi 4.1.0 (from ps-4.3.2 w/o ucx)
Traffic flowed between two nodes: rome0253-0254 on sdf and eb011-010 on srcf. The rates (GB/s) shown here are the maximum rates observed.
Send Size (MB) | sdf openmpi 4.0.4 w ucx (GB/s) | srcf openmpi 4.1.1 w ucx (GB/s) | srcf openmpi 4.1.0 w/o ucx (GB/s) |
---|---|---|---|
0.1 | 8.24 | 1.96 | 2.01 |
1 | 6.81 | 6.09 | 5.85 |
10 | 7.61 | 3.17 | 3.16 |
100 | 3.73 | 2.38 | 2.35 |
1000 | 4.26 | 2.23 | 1.97 |
The rates in the table above are somewhat too optimistic: the timer was stopped right after the Send call returned, so the measurement did not account for the actual data transfer time.
The table below shows more realistic rates, obtained by timing the entire process of two ranks (on different nodes), each performing a series of Send and Recv calls.
Send Size (MB) | sdf openmpi 4.0.4 w ucx (GB/s) | srcf openmpi 4.1.1 w ucx (GB/s) | srcf openmpi 4.1.0 w/o ucx (GB/s) |
---|---|---|---|
0.1 | 8.73 | 1.2 | 1.1 |
1 | 10.6 | 2.0 | 2.0 |
10 | 12.2 | 2.4 | 2.3 |
100 | 12.3 | 2.6 | 2.6 |
1000 | 12.3 | 2.3 | 2.3 |
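The difference between the two timing methods can be sketched as follows. This is a minimal illustration, not the script that produced the numbers above: it assumes mpi4py and NumPy are available, a ping-pong pattern between exactly two ranks, and 1 MB = 1e6 bytes and 1 GB = 1e9 bytes. The names `rate_gb_per_s` and `ping_pong` are hypothetical.

```python
import time


def rate_gb_per_s(msg_bytes, n_msgs, elapsed_s):
    """Throughput in GB/s, taking 1 GB = 1e9 bytes (an assumption)."""
    return msg_bytes * n_msgs / elapsed_s / 1e9


def ping_pong(size_mb=10.0, n_iters=100):
    """Time n_iters Send/Recv round trips between ranks 0 and 1."""
    # Imported lazily so rate_gb_per_s can be used without MPI installed.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    if comm.Get_size() != 2:
        raise RuntimeError("run with exactly two ranks, one per node")

    msg_bytes = int(size_mb * 1e6)  # assumes 1 MB = 1e6 bytes
    send_buf = np.ones(msg_bytes, dtype=np.uint8)
    recv_buf = np.empty(msg_bytes, dtype=np.uint8)
    peer = 1 - rank
    send_only = 0.0

    comm.Barrier()  # start both ranks together
    t_start = time.perf_counter()
    for _ in range(n_iters):
        if rank == 0:
            t0 = time.perf_counter()
            comm.Send([send_buf, MPI.BYTE], dest=peer)
            # This timer stops as soon as Send returns, which can be
            # before the data has actually reached the other node.
            send_only += time.perf_counter() - t0
            comm.Recv([recv_buf, MPI.BYTE], source=peer)
        else:
            comm.Recv([recv_buf, MPI.BYTE], source=peer)
            comm.Send([send_buf, MPI.BYTE], dest=peer)
    total = time.perf_counter() - t_start  # covers every Send and Recv

    if rank == 0:
        # Each iteration moves msg_bytes in each direction.
        send_rate = rate_gb_per_s(msg_bytes, n_iters, send_only)
        full_rate = rate_gb_per_s(msg_bytes, 2 * n_iters, total)
        print(f"send-only:  {send_rate:.2f} GB/s")
        print(f"end-to-end: {full_rate:.2f} GB/s")
```

To try it, call `ping_pong()` at the bottom of a script (for example a hypothetical `bench.py`) and launch it with two ranks, such as `mpirun -n 2 python bench.py`, using the scheduler or a hostfile to place the ranks on different nodes.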