
June 17, 2024

With Ryan, Larry, Mudit, Ric, Gabriel, cpo

Topics to discuss:

line 96 (https://github.com/slaclab/axi-pcie-devel/blob/3f5a268226f2fe4324add6d68063b27c140ad4b9/software/gpu/src/test_dma.cu#L96) talks to the KCU: it puts a GPU memory address into the KCU. A GPU address, which is good!
rdsrv415 has a PCIe chassis that we could consider
the IOMMU allows this, though with wide-open security settings
line 123 is for the GPU; Ryan says "permissions"
lines 132 and 137 are for specific registers
line 152 is a GPU register
line 159 is the GPU writing to the KCU, telling it we are done with the buffer
line 164 is the GPU polling
line 173 transfers data back from the GPU to the KCU

hwWritePtr is a GPU pointer
hwWriteStart is a KCU pointer
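The write-pointer/polling handshake described above can be sketched as follows. This is a minimal simulation, not the actual test_dma.cu logic: hwWritePtr follows the notes, bufferDone is a hypothetical name for the "done with the buffer" write-back, and plain Python attributes stand in for the memory-mapped registers.

```python
class Regs:
    """Stand-ins for memory-mapped registers. hwWritePtr follows the notes;
    bufferDone is a hypothetical name for the 'done with the buffer' write."""
    def __init__(self):
        self.hwWritePtr = 0    # producer (KCU) advances this after filling a buffer
        self.bufferDone = -1   # consumer (GPU) writes back the index it finished

NBUF, BUFSZ = 4, 8

def consume_one(regs, rd_ptr, bufs, totals):
    # "line 164 gpu polling": spin until the write pointer moves past us
    while regs.hwWritePtr == rd_ptr:
        pass  # on the GPU this would be a device-side poll of a mapped pointer
    totals.append(sum(bufs[rd_ptr % NBUF]))  # consume the buffer
    regs.bufferDone = rd_ptr                 # "tells the kcu we are done"
    return rd_ptr + 1

def run_demo(n=3):
    regs, totals, rd_ptr = Regs(), [], 0
    bufs = [[0] * BUFSZ for _ in range(NBUF)]
    # producer and consumer interleaved in one thread for the sketch;
    # on hardware these run on the KCU and the GPU respectively
    for wr in range(n):
        bufs[wr % NBUF] = [wr + 1] * BUFSZ   # "DMA" fills the buffer
        regs.hwWritePtr = wr + 1             # publish the new write pointer
        rd_ptr = consume_one(regs, rd_ptr, bufs, totals)
    return totals, regs.bufferDone

print(run_demo())  # -> ([8, 16, 24], 2)
```

On real hardware the polling and the done write happen from GPU kernel code against BAR-mapped addresses, which is why having a GPU address in the KCU (line 96) matters.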

Larry says the GPU is PCIe Gen4 x16, ~50 GB/s.
A Gen4 C1100 bifurcated 8+8 matches the GPU: 25 GB/s for 8 lanes, 50 GB/s for 2 x 8 lanes? Can PGP also bring in 50 GB/s? (Note: Gen4 is nominally ~2 GB/s per lane usable, so x16 is closer to ~32 GB/s; the 50 GB/s figure may be optimistic.)

Larry mentioned the possibility of using the C1100 (https://www.xilinx.com/products/accelerators/varium/c1100.html) with a bifurcated PCIe bus to increase the data rate into the GPU to hopefully 50 GB/s. But it looks like it only supports 2 QSFP28 ports at 100 Gb/s each, i.e. 200 Gb/s = 25 GB/s in total per C1100 (less after protocol overhead). So to support the 1 TB/s produced by the largest epixUHR we would need ~40 C1100s, and either pair them into ~20 GPUs or put 40 C1100s into 40 GPUs.
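A back-of-envelope check of the C1100 math above, using raw link rates only (real throughput will be somewhat lower after protocol overhead, which may be where a more conservative per-card figure comes from):

```python
GBIT = 1e9

# Two QSFP28 ports per C1100 at 100 Gb/s each (per the notes)
c1100_bytes_per_s = 2 * 100 * GBIT / 8
print(c1100_bytes_per_s / 1e9)   # -> 25.0 GB/s raw per C1100

# Largest epixUHR produces ~1 TB/s (per the notes)
detector_bytes_per_s = 1e12
cards = detector_bytes_per_s / c1100_bytes_per_s
print(cards)                     # -> 40.0 C1100s at raw line rate
```

At raw line rate that is ~40 cards; with two C1100s feeding each GPU over the bifurcated 8+8 bus, that would be ~20 GPUs.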
