...

srun --partition=anagpu --ntasks=4 --ntasks-per-node=4 python ./test_mpi.py

Comparing Different Builds of openmpi on drp-srcf and SLAC sdf

We compare how different builds of openmpi perform on drp-srcf and sdf nodes.

sdf:

1) openmpi 4.0.4 ('--prefix=/opt/openmpi-4.0.4' '--with-ucx' '--enable-mpi-cxx')

drp-srcf:

1) openmpi 4.1.1 (from ana-4.0.23-py3 with ucx)

2) openmpi 4.1.0 (from ps-4.3.2 w/o ucx)

Traffic went between two nodes: rome0253-0254 on sdf and eb011-010 on srcf. The rates shown here (GB/s) are the maximum rates observed.

...

The table above shows somewhat "too optimistic" rates: the timer was stopped right after the Send command returned, so the measurement did not account for the actual data transfer time.

The table below shows more realistic rates, obtained by timing the entire exchange between two ranks (on different nodes), each performing a series of Send and Recv calls.
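As an illustration of the second timing method, here is a minimal sketch using mpi4py. It is not the actual test_mpi.py; the function names, message size, and iteration count are assumptions chosen for the example. The timer brackets the whole series of Send and Recv calls, so the measured time includes the transfers themselves rather than only the return of Send.

```python
def rate_gb_per_s(msg_bytes, n_msgs, elapsed_s):
    """Rate in GB/s (1 GB = 1e9 bytes) for n_msgs messages of msg_bytes each."""
    return msg_bytes * n_msgs / elapsed_s / 1e9


def run_exchange(msg_bytes=64 * 1024 * 1024, n_iters=20):
    """Time a series of Send/Recv pairs between two ranks (hypothetical sizes)."""
    # Imported here so rate_gb_per_s() stays usable without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    if comm.Get_size() != 2:
        if rank == 0:
            print("this sketch expects exactly 2 ranks")
        return None
    peer = 1 - rank

    send_buf = bytearray(msg_bytes)
    recv_buf = bytearray(msg_bytes)

    comm.Barrier()  # start both ranks together
    t0 = MPI.Wtime()
    for _ in range(n_iters):
        # Rank 0 sends first, rank 1 receives first, so the blocking
        # calls always pair up and cannot deadlock.
        if rank == 0:
            comm.Send([send_buf, MPI.BYTE], dest=peer, tag=0)
            comm.Recv([recv_buf, MPI.BYTE], source=peer, tag=1)
        else:
            comm.Recv([recv_buf, MPI.BYTE], source=peer, tag=0)
            comm.Send([send_buf, MPI.BYTE], dest=peer, tag=1)
    elapsed = MPI.Wtime() - t0  # stops only after the last Recv/Send completes

    # Each iteration moves one message in each direction through this rank.
    rate = rate_gb_per_s(msg_bytes, 2 * n_iters, elapsed)
    print(f"rank {rank}: {rate:.2f} GB/s over {n_iters} iterations")
    return rate


if __name__ == "__main__":
    try:
        run_exchange()
    except ImportError:
        print("mpi4py is not installed; only rate_gb_per_s() is usable")
```

To place the two ranks on different nodes, the script could be launched with something like `srun --nodes=2 --ntasks=2 --ntasks-per-node=1 python ./pingpong.py`, where the file name is hypothetical and the partition option depends on the cluster. Because each Send is matched by a completed Recv before the timer stops, the result reflects the round trip rather than only the time to hand the buffer to MPI.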

...