For openmpi 4.0 and later, the openib transport has to be allowed explicitly (see the bsub command below). There was some discussion of openmpi wanting to use "UCX" by default instead, which seems to be a libfabric competitor?

#./configure --prefix=`pwd`/install --with-lsf=/afs/slac/package/lsf/curr/ --with-lsf-libdir=/afs/slac/package/lsf/curr/lib/ --enable-mpi-cxx --with-verbs
make -j 10
make install

  • The mpi4py recipe creates an activation script that sets PATH/LD_LIBRARY_PATH to point at the appropriate openmpi in /reg/common/package/openmpi/
  • I think this was done (instead of using a conda package) so we could use one mpi4py with both the rhel6 and rhel7 openmpi versions, but I am not certain
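
As a rough illustration of what such an activation script could look like, here is a minimal sketch. It is not the actual script from the recipe: the `rhel<N>` subdirectory naming under the openmpi root, the `OPENMPI_ROOT` variable, and the helper names `os_major` and `openmpi_prefix` are all assumptions made for the example.

```shell
# Hypothetical sketch of a conda activate.d script (e.g. openmpi.sh).
# ASSUMPTION: one install prefix per OS release, named rhel6, rhel7, ...
# The real layout under /reg/common/package/openmpi/ may differ.
OPENMPI_ROOT="${OPENMPI_ROOT:-/reg/common/package/openmpi}"

# Print the major OS release parsed from a redhat-release style string,
# e.g. "Red Hat Enterprise Linux Server release 7.9 (Maipo)" -> 7
os_major() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p'
}

# Print the (assumed) openmpi install prefix for a given major release.
openmpi_prefix() {
    printf '%s/rhel%s\n' "$OPENMPI_ROOT" "$1"
}

# At activation time, pick the prefix matching the running OS, if it exists.
if [ -r /etc/redhat-release ]; then
    _major="$(os_major "$(cat /etc/redhat-release)")"
    _prefix="$(openmpi_prefix "$_major")"
    if [ -d "$_prefix" ]; then
        export PATH="$_prefix/bin:$PATH"
        export LD_LIBRARY_PATH="$_prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    fi
    unset _major _prefix
fi
```

Because the choice is made when the environment is activated, the same conda env can be used from a rhel6 node and a rhel7 node and pick up a different openmpi on each.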

this bsub command seems to work:

(ps-2.0.8) psanagpu109:lcls2$ bsub -R "span[ptile=1]" -q psanaq -o %J.log -n 4 mpirun --mca btl_openib_allow_ib 1 python junk.py

(ps-2.0.8) psanagpu109:lcls2$ more junk.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()

from psana import DataSource
import numbers
import numpy as np
ds = DataSource(exp='tstx00517',run=19)
myrun = next(ds.runs())

for evt in myrun.events():
    print('*** evt',rank)
(ps-2.0.8) psanagpu109:lcls2$
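
The `--mca btl_openib_allow_ib 1` flag in the bsub command above can probably also be supplied through the environment, since Open MPI reads MCA parameters from `OMPI_MCA_<name>` variables. The sketch below has not been tried on this cluster; it assumes the submission environment is passed on to the batch job, and the `ompi_info` check only runs if that tool is on PATH.

```shell
# Equivalent to passing "--mca btl_openib_allow_ib 1" on the mpirun command line.
# ASSUMPTION: the batch system forwards this variable to the job's mpirun.
export OMPI_MCA_btl_openib_allow_ib=1

# Optionally show how this openmpi build reports the parameter.
# Prints nothing if the openib component is not built in.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info --param btl openib --level 9 | grep allow_ib || true
fi
```

With the variable exported, the mpirun part of the command would shorten to `mpirun python junk.py`.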

mpi4py

  • NOTE: the mpi4py conda recipe creates a conda activate script that can dynamically (at run time!) pick up an OS-specific version of openmpi. That's why we didn't create a conda package for openmpi: two OS-specific versions could not exist in the same conda env. Instead we pick up the right version with a conda "openmpi.sh" script when the env is activated. I guess the implication is that openmpi is rhel6/rhel7 specific, but I don't understand why (perhaps because infiniband is OS-specific?).
  • mpi4py should be kept in a local conda channel (not the cloud) since it is facility-specific