Open MPI hangs on large messages

This happens with Open MPI 4.1.1. To reproduce the problem, run the script below with:

mpirun -n 2 python test_largemsg.py
Contents of test_largemsg.py:

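from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# 20000 C ints: about 80 KB per message with 4-byte ints.
n = 20000
if rank == 0:
    # The root fills the buffer that will be broadcast.
    data = np.arange(n, dtype='i')
else:
    # The other ranks allocate an uninitialized receive buffer of the same size.
    data = np.empty(n, dtype='i')

# Buffer-based broadcast of the whole array from rank 0.
comm.Bcast(data, root=0)
print(f'rank={rank} data[-1]={data[-1]}')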
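
To put a number on "large", the payload of the broadcast above can be estimated without MPI. This is only a sketch: bcast_payload_bytes is a hypothetical helper name, and it assumes that the numpy dtype 'i' and the struct type code 'i' both map to the platform's C int.

```python
import struct


def bcast_payload_bytes(n, typecode="i"):
    """Estimate the size in bytes of a buffer holding n elements of a C type."""
    # struct.calcsize reports the native size of the C type named by typecode.
    return n * struct.calcsize(typecode)


# Size of the array broadcast by test_largemsg.py (n = 20000).
# This is 80000 bytes on platforms where a C int is 4 bytes.
print(bcast_payload_bytes(20000))
```

Changing n in the reproducer and recomputing this figure may help narrow down the message size at which the hang starts.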

Solution: exclude Open MPI's TCP BTL (byte transfer layer) component with the following command:

mpirun -n 2 --mca btl ^tcp python test_largemsg.py
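
The same MCA setting can also be passed through the environment, since Open MPI reads variables of the form OMPI_MCA_<name>. The sketch below is one possible way to launch the reproducer from Python with that variable set and a timeout, so that a hang is reported instead of blocking forever. The helper names env_without_tcp and run_reproducer and the 60-second timeout are assumptions made for illustration, not part of the original report.

```python
import os
import subprocess


def env_without_tcp(base_env=None):
    """Return a copy of the environment with the TCP BTL excluded."""
    env = dict(os.environ if base_env is None else base_env)
    # Environment-variable form of "--mca btl ^tcp".
    env["OMPI_MCA_btl"] = "^tcp"
    return env


def run_reproducer(script="test_largemsg.py", nprocs=2, timeout_s=60):
    """Run the reproducer; return True if it finishes, False if it times out."""
    cmd = ["mpirun", "-n", str(nprocs), "python", script]
    try:
        subprocess.run(cmd, env=env_without_tcp(), timeout=timeout_s, check=True)
        return True
    except subprocess.TimeoutExpired:
        # The run did not finish within timeout_s seconds (assumed to be a hang).
        return False
```

Calling run_reproducer() requires mpirun and test_largemsg.py to be available in the current directory; a return value of False suggests that the hang is still present.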

Note: The hang does NOT occur when the job is launched with srun.


