...

We have some evidence that running this can fix the problem with the teb-to-drp DrpEbReceiver process (the "383" (counting from zero) or "384" event problem), but eblf_pingpong may have to be run in the "right direction" (perhaps "-S" has to be on the broken node?). The number comes from libfabric having, by default, 384 buffers in the "completion queue"; somehow the completion queue is getting stuck.
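The two numbers quoted above are the same boundary counted two ways. The sketch below is only a toy model, not libfabric code: it assumes a queue of 384 entries from which nothing is ever drained (the "stuck" case), and the function name is made up for illustration.

```python
CQ_DEPTH = 384  # completion-queue depth quoted in the text above


def last_event_that_fits(cq_depth: int) -> int:
    """Zero-based index of the last event that fits in a queue of
    cq_depth entries when no entry is ever drained (toy assumption)."""
    pending = 0   # entries currently occupying the queue
    event = -1    # zero-based index of the most recent event posted
    while pending < cq_depth:
        event += 1
        pending += 1  # each event takes one entry; none are released
    return event


last_ok = last_event_that_fits(CQ_DEPTH)
# Zero-based index of the last event, then the same event counted from one.
print(last_ok, last_ok + 1)
```

Under that assumption the last event to get through is number 383 counting from zero, which is the 384th event overall.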

Run eblf_pingpong on two different nodes:

...