
We may not need to use OpenMPI with Infiniband if we can get similar performance running psana2 over Ethernet for MPI communication. These connections are only needed to transfer a small amount of data (11 GB in this test) from Smd0 to the EventBuilder and BigData nodes. Here we show the performance of reading 123 GB from 16 files using 7 drp nodes (113 cores: 1 Smd0, 12 EventBuilders, 100 BigData cores).

 


Conclusion:

Using OpenMPI with Infiniband: Rate 39.5 kHz (Total Time: 253 s)

Using MPICH from conda on Ethernet: Rate 39.7 kHz (Total Time: 252 s) 
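For scale, both configurations read the same 123 GB in roughly 252-253 s, i.e. about 123 GB / 253 s ≈ 0.49 GB/s of aggregate read throughput, so for this workload the choice of interconnect makes essentially no difference.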


Note 1: below are plots from Grafana showing incoming/outgoing traffic

...

MPICH on Ethernet: no noticeable peaks

 


To run the test:

OpenMPI with Infiniband: 

...

myhost = MPI.Get_processor_name()
import numpy as np
...
n = 100000
if rank == 0:
    data = np.arange(1000000, dtype='i')
...
        if np.sum(data) == 0:
            break
...
print(f'rank{rank} on host {myhost} done')

...
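The fragment above omits the setup and the send/receive loop. As a point of reference, here is a minimal, self-contained sketch that is consistent with the visible pieces (the value of n, the zero-sum stop condition, and the final print); the actual test_mpi.py may differ in how it distributes the buffers. It can be launched with e.g. mpirun -n 113 python test_mpi.py under OpenMPI, or via Slurm as described in the note below.

# Sketch only: assumes rank 0 pushes a numpy buffer to the other ranks
# round-robin, and an all-zero buffer tells the receivers to stop.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
myhost = MPI.Get_processor_name()
assert size > 1, 'run with at least 2 ranks'

n = 100000                                   # total number of messages sent by rank 0
if rank == 0:
    data = np.arange(1000000, dtype='i')     # ~4 MB payload per message
    for i in range(n):
        comm.Send(data, dest=(i % (size - 1)) + 1, tag=0)
    stop = np.zeros(1000000, dtype='i')      # all-zero buffer = stop signal
    for dest in range(1, size):
        comm.Send(stop, dest=dest, tag=0)
else:
    data = np.empty(1000000, dtype='i')
    while True:
        comm.Recv(data, source=0, tag=0)
        if np.sum(data) == 0:
            break

print(f'rank{rank} on host {myhost} done')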

Grafana dashboard used for the traffic plots:

https://pswww.slac.stanford.edu/system/grafana/d/C81U-s_mz/drp-dev-io?orgId=1&var-IBmetrics=node_infiniband_port_constraint_errors_transmitted_total&var-job=drpdev&var-group=dev&var-nname=drp-tst-dev&var-node=All

Note on Slurm

To submit a Slurm job, use one of the following two methods:

sbatch submit_slac.sh

cat submit_slac.sh

#!/bin/bash
#SBATCH --partition=anagpu
#SBATCH --job-name=psana2-test
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=4
#SBATCH --output=%j.log

# -u flushes print statements which can otherwise be hidden if mpi hangs
t_start=`date +%s`
srun python -u ./test_mpi.py
t_end=`date +%s`

echo PSJobCompleted TotalElapsed $((t_end-t_start))

or

srun --partition=anagpu --ntasks=4 --ntasks-per-node=4 python ./test_mpi.py
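Since the batch script writes its output to %j.log, the job can be checked and its log inspected afterwards with standard Slurm commands (illustrative examples, not part of the original instructions):

squeue -u $USER        # is the job still pending or running?
cat <jobid>.log        # read the output once the job has finished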