OpenMPI hangs on large message
This happens with Open MPI 4.1.1. To reproduce the problem, run the script below with:
mpirun -n 2 python test_largemsg.py
cat test_largemsg.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
import numpy as np
n = 20000
if rank == 0:
    data = np.arange(n, dtype='i')
else:
    data = np.empty(n, dtype='i')
comm.Bcast(data, root=0)
print(f'rank={rank} data[-1]={data[-1]}')
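To put a number on "large message", the sketch below computes the size of the buffer that the script broadcasts. It uses `ctypes` instead of NumPy so that it runs without MPI installed, and it relies on NumPy's `dtype='i'` being a C `int`. The name `message_bytes` is illustrative only. This page does not state the size at which the hang begins, so the figure is the size of this test case, not a threshold.

```python
import ctypes

n = 20000  # element count used in test_largemsg.py

# dtype='i' in NumPy is a C int; its size is platform dependent (commonly 4 bytes).
itemsize = ctypes.sizeof(ctypes.c_int)

# Total payload handed to comm.Bcast by each rank.
message_bytes = n * itemsize

print(f"{n} elements x {itemsize} bytes = {message_bytes} bytes "
      f"({message_bytes / 1024:.1f} KiB)")
```

On platforms where a C `int` is 4 bytes, this gives 80,000 bytes per broadcast.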
Solution: exclude Open MPI's TCP BTL (byte transfer layer) component with the following command:
mpirun -n 2 --mca btl ^tcp python test_largemsg.py
Note: The hang does NOT occur when the job is launched with srun.
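When checking whether the workaround helps on a given machine, it can be convenient to detect the hang automatically instead of waiting at the terminal. The following is a minimal sketch, not part of the original report. The helpers `run_with_timeout`, `mpirun_command` and `compare` are hypothetical names, and the 60-second timeout is an arbitrary assumption. It sets the `OMPI_MCA_btl` environment variable, which Open MPI reads as the equivalent of `--mca btl` on the command line.

```python
import os
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, extra_env=None):
    """Run cmd; return its exit code, or None if it did not finish in time."""
    env = dict(os.environ)
    env.update(extra_env or {})
    try:
        return subprocess.run(cmd, env=env, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout; we report this as a hang.
        return None


def mpirun_command(script, nprocs=2):
    """Build the launch command from this page, using the current interpreter."""
    return ["mpirun", "-n", str(nprocs), sys.executable, script]


def compare(script, timeout_s=60):
    """Launch the script with default settings and with the TCP BTL excluded."""
    cmd = mpirun_command(script)
    cases = [
        ("default", {}),
        # Environment-variable form of "--mca btl ^tcp".
        ("tcp excluded", {"OMPI_MCA_btl": "^tcp"}),
    ]
    for label, env in cases:
        code = run_with_timeout(cmd, timeout_s, extra_env=env)
        status = "timed out (possible hang)" if code is None else f"exit code {code}"
        print(f"{label}: {status}")
```

Calling `compare("test_largemsg.py")` from the directory containing the script runs both cases in turn. After a timeout only the `mpirun` process itself is killed, so it may be worth checking that no ranks are left running.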