By replacing the blocking Send with the non-blocking Isend, we allow Smd0 to move on right after initiating a send to an eventbuilder core. With this overlap, the total wall time improves from 7.4 to 4.4 seconds with 16 eventbuilder cores.
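The overlap described above can be illustrated with a small standard-library simulation. This is only a sketch, not the Smd0 code: a worker thread stands in for the in-flight transfer, a future plays the role of the request returned by Isend, and `prepare_chunk`, `transfer`, `PREP_S` and `SEND_S` are hypothetical names and timings chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

PREP_S = 0.02  # hypothetical time to prepare one chunk
SEND_S = 0.02  # hypothetical time for one transfer to complete


def prepare_chunk(i):
    """Stand-in for the per-chunk work done before a send."""
    time.sleep(PREP_S)
    return f"chunk-{i}"


def transfer(chunk):
    """Stand-in for moving one chunk to an eventbuilder core."""
    time.sleep(SEND_S)
    return chunk


def run_blocking(n_chunks):
    """Send-style loop: each transfer finishes before the next chunk is prepared."""
    start = time.perf_counter()
    delivered = []
    for i in range(n_chunks):
        chunk = prepare_chunk(i)
        delivered.append(transfer(chunk))
    return delivered, time.perf_counter() - start


def run_overlapped(n_chunks):
    """Isend-style loop: start a transfer, then prepare the next chunk meanwhile."""
    start = time.perf_counter()
    delivered = []
    pending = None  # plays the role of the request returned by Isend
    with ThreadPoolExecutor(max_workers=1) as pool:
        for i in range(n_chunks):
            chunk = prepare_chunk(i)  # overlaps with the previous transfer
            if pending is not None:
                delivered.append(pending.result())  # wait on the earlier request
            pending = pool.submit(transfer, chunk)
        if pending is not None:
            delivered.append(pending.result())
    return delivered, time.perf_counter() - start


if __name__ == "__main__":
    _, t_block = run_blocking(5)
    _, t_over = run_overlapped(5)
    print(f"blocking: {t_block:.3f} s, overlapped: {t_over:.3f} s")
```

In the overlapped loop the sender only waits on a request when it is about to issue the next one, so preparation time and transfer time no longer add up chunk by chunk.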


| TASK | eb=1 total(ms) | eb=1 #occurs | eb=2 total(ms) | eb=2 #occurs | eb=4 total(ms) | eb=4 #occurs | eb=8 total(ms) | eb=8 #occurs | eb=16 total(ms) | eb=16 #occurs |
|---|---|---|---|---|---|---|---|---|---|---|
| SMD0GOTCHUNK | 1995.3707 | 1086 | 1993.8369 | 1086 | 1975.744 | 1086 | 1964.946 | 1086 | 1983.7913 | 1086 |
| SMD0GOTEB | 5841.4978 | 1087 | 2779.48 | 1087 | 1964.1136 | 1087 | 1916.0166 | 1087 | 1832.1413 | 1087 |
| SMD0GOTREPACK | 297.8586 | 1087 | 258.733 | 1087 | 295.582 | 1087 | 306.5737 | 1087 | 345.9558 | 1087 |
| SMD0DONEWITHEB | 57.976 | 1087 | 61.0478 | 1087 | 62.7437 | 1087 | 61.2395 | 1087 | 60.6476 | 1087 |
| SMD0GOTSTEPHIST | 78.2734 | 1087 | 85.6866 | 1087 | 87.6596 | 1087 | 80.2787 | 1087 | 81.9809 | 1087 |
| SMD0GOTSTEP | 86.3738 | 1087 | 89.95 | 1087 | 91.3487 | 1087 | 89.2696 | 1087 | 88.6243 | 1087 |

| | eb=1 | eb=2 | eb=4 | eb=8 | eb=16 |
|---|---|---|---|---|---|
| total (ms) | 8357.2618 | 5268.1544 | 4477.0777 | 4420.2842 | 4391.1213 |
| rate (MHz) | 1.20 | 1.90 | 2.23 | 2.26 | 2.28 |
Conclusions / Known Issues

We gain some performance by overlapping Send with other computation tasks. However, the Isend/Irecv version of the code crashes with the current real experiment data (tmoc00118, run=463). We need to investigate this issue before continuing this work.

In addition to overlapping the send, we can also perform computational tasks while Smd0 waits for an eventbuilder core to come back (Irecv). This implementation should be explored after the issue mentioned above is solved.
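One way the Irecv idea could look is sketched below, again as a standard-library simulation rather than the actual implementation. A future stands in for the posted receive, polling `done()` stands in for a non-blocking test on the request, and `eventbuilder_reply`, `wait_with_work` and the timings are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def eventbuilder_reply(delay_s):
    """Stand-in for an eventbuilder core that reports back after delay_s seconds."""
    time.sleep(delay_s)
    return "ready"


def wait_with_work(delay_s, work_unit_s):
    """Post the receive, then do small units of work until the reply arrives."""
    units_done = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(eventbuilder_reply, delay_s)  # role of Irecv
        while not pending.done():  # role of a non-blocking test on the request
            time.sleep(work_unit_s)  # hypothetical useful work done meanwhile
            units_done += 1
        reply = pending.result()
    return reply, units_done


if __name__ == "__main__":
    reply, units = wait_with_work(delay_s=0.05, work_unit_s=0.005)
    print(f"reply={reply}, work units completed while waiting: {units}")
```

The work units should be kept short relative to the expected reply time, since a reply that arrives in the middle of a unit is only noticed once that unit finishes.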