We investigate how different builds of openmpi perform on drp-srcf and sdf nodes.

sdf:

1) openmpi 4.0.4 ('--prefix=/opt/openmpi-4.0.4' '--with-ucx' '--enable-mpi-cxx')

drp-srcf:

1) openmpi 4.1.1 (from ana-4.0.23-py3 with ucx)

2) openmpi 4.1.0 (from ps-4.3.2 w/o ucx)

Traffic was sent between two nodes: rome0253-0254 on sdf and eb011-010 on srcf. The rates (GB/s) shown here are the maximum rates observed.

Send Size (MB) | sdf openmpi 4.0.4 w ucx (GB/s) | srcf openmpi 4.1.1 w ucx (GB/s) | srcf openmpi 4.1.0 w/o ucx (GB/s)
0.1            | 8.24                           | 1.96                            | 2.01
1              | 6.81                           | 6.09                            | 5.85
10             | 7.61                           | 3.17                            | 3.16
100            | 3.73                           | 2.38                            | 2.35
1000           | 4.26                           | 2.23                            | 1.97

The rates in the table above are somewhat too optimistic: the timer was stopped right after the Send call returned, so the measurement did not account for the actual data transfer time.

The table below shows more realistic rates, obtained by timing the entire exchange between two ranks (on different nodes), each performing a series of Send and Recv calls.

Send Size (MB) | sdf openmpi 4.0.4 w ucx (GB/s) | srcf openmpi 4.1.1 w ucx (GB/s) | srcf openmpi 4.1.0 w/o ucx (GB/s)
0.1            | 8.73                           | 1.2                             | 1.1
1              | 10.6                           | 2.0                             | 2.0
10             | 12.2                           | 2.4                             | 2.3
100            | 12.3                           | 2.6                             | 2.6
1000           | 12.3                           | 2.3                             | 2.3
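
The difference between the two timing methods can be illustrated with a minimal sketch. This is not the script used for the measurements above, which is not shown here. The function names `rate_gb_per_s` and `timed_exchange`, the default iteration count, the use of decimal units (1 MB = 10**6 bytes, 1 GB = 10**9 bytes), and the way bytes are counted for the full-exchange rate are all assumptions made for illustration. `comm` is expected to be an mpi4py communicator with exactly two ranks placed on different nodes.

```python
import time

MB = 1_000_000  # assumption: 1 MB = 10**6 bytes


def rate_gb_per_s(n_bytes, seconds):
    """Convert a byte count and an elapsed time into GB/s (10**9 bytes/s)."""
    return n_bytes / seconds / 1e9


def timed_exchange(comm, size_mb, n_iter=10):
    """Run n_iter Send/Recv round trips between ranks 0 and 1.

    Returns (send_only_rate, full_exchange_rate) in GB/s for the calling rank.
    """
    rank = comm.Get_rank()
    peer = 1 - rank  # only ranks 0 and 1 take part
    send_buf = bytearray(int(size_mb * MB))
    recv_buf = bytearray(len(send_buf))
    best_send = float("inf")

    comm.Barrier()  # start both ranks together
    t_start = time.perf_counter()
    for _ in range(n_iter):
        if rank == 1:
            comm.Recv(recv_buf, source=peer)
        t0 = time.perf_counter()
        comm.Send(send_buf, dest=peer)
        # Stopping the clock here only measures how long Send took to return.
        best_send = min(best_send, time.perf_counter() - t0)
        if rank == 0:
            comm.Recv(recv_buf, source=peer)
    elapsed = time.perf_counter() - t_start

    # "Optimistic" rate: payload size over the shortest observed Send call.
    send_only = rate_gb_per_s(len(send_buf), best_send)
    # "Realistic" rate: each round trip moves the payload once in each direction.
    full = rate_gb_per_s(2 * n_iter * len(send_buf), elapsed)
    return send_only, full


# Usage sketch (requires mpi4py, launched with two ranks on different nodes):
#   from mpi4py import MPI
#   print(timed_exchange(MPI.COMM_WORLD, size_mb=10))
```

The first returned value corresponds to the "maximum rate observed" style of measurement, and the second to timing the whole series of Send and Recv calls. Under these assumptions the two numbers can differ noticeably when Send returns before the data has been delivered to the peer.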