To achieve a MHz rate, the heavy-lifting core SMD0 (MPI rank 0) needs to be assigned exclusively to its own node, so that it can read smalldata in parallel at the node's full bandwidth. To assign a specific node to SMD0, run:

Code Block
#!/bin/bash

#SBATCH --partition=anagpu
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=3
#SBATCH --output=%j.log

# set up the node configuration
./setup_nodes.sh

# "-u" flushes print statements, which can otherwise be hidden if MPI hangs
# "-m mpi4py.run" allows MPI to exit if one rank raises an exception
mpirun python -u -m mpi4py.run /reg/g/psdm/tutorials/examplePython/mpiDataSource.py
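
After submitting, you can confirm which nodes the scheduler assigned; with the default Open MPI mapping, rank 0 (SMD0) is placed on the first node of the allocation. A minimal sketch, assuming the script above is saved as submit_smd0.sh (hypothetical file name):

Code Block
# submit the batch script (file name is an assumption)
sbatch submit_smd0.sh

# inside the job (e.g. added to the script above), list the allocated nodes;
# with the default Open MPI mapping, rank 0 (SMD0) runs on the first node listed
scontrol show hostnames "$SLURM_JOB_NODELIST"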

2. Open MPI failed to TCP connect

We observe that when running on more than a few nodes (> 20), and the nodes assigned to your job come from mixed pools (cmp, mon, or eb nodes mixed in), jobs may fail with the following message:

Code Block
--------------------------------------------------------------------------
WARNING: Open MPI failed to TCP connect to a peer MPI process.  This
should not happen.

Your Open MPI job may now hang or fail.

  Local host: drp-srcf-eb010
  PID:        130465
  Message:    connect() to 172.21.164.90:1055 failed
  Error:      Operation now in progress (115)
--------------------------------------------------------------------------

This happens because each node has multiple network interfaces (TCP over Ethernet, InfiniBand, etc.), but not every node has all of them enabled; the connection fails when Open MPI selects an interface that is missing on one of the nodes in the job.
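
To see which interfaces are actually up on each node of your allocation before launching MPI, a quick per-node check can help. This is only an illustration; interface names such as eno1 or ib0 vary between node pools:

Code Block
# run one task per allocated node and print its active network interfaces
srun --ntasks-per-node=1 bash -c 'echo "== $(hostname) =="; ip -brief addr show up'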

Solution

Prior to running your job, apply either 1) or 2).

1. Restrict Open MPI to the 172.21.164 subnet

Code Block
export OMPI_MCA_btl_tcp_if_include=172.21.164.0/24

2. Exclude the 172.21.152 subnet (interface eno1)

Code Block
export OMPI_MCA_btl_tcp_if_exclude=eno1
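
Either variable only needs to be exported before mpirun is invoked, for example near the top of the batch script shown earlier. The subnet and interface values below are the ones from this page; adjust them for your system:

Code Block
# in the batch script, before the mpirun line; pick one of the two options
export OMPI_MCA_btl_tcp_if_include=172.21.164.0/24    # 1) restrict to the 172.21.164 subnet
# export OMPI_MCA_btl_tcp_if_exclude=eno1             # 2) exclude the eno1 interface
mpirun python -u -m mpi4py.run /reg/g/psdm/tutorials/examplePython/mpiDataSource.py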