by throwaway173738
694 days ago
I’ve used other peripherals that did this. Under the hood you would have a virtual mapping to a physical address and extent, where the virtual mapping is in the address space of your process. This is how DMA works in QNX, because drivers are userspace processes. The special thing here is essentially doing the math in the same process as the driver. I agree that sounds very nice for distributed computation.
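To make the mapping idea above concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the anonymous mapping stands in for a device buffer, and `device_write` and `driver_read` are hypothetical names, not any real driver API. A real userspace driver would map an actual physical address and extent through an OS-specific call instead.

```python
import mmap
import struct

# Extent of the mapping: one page, purely for illustration.
EXTENT = mmap.PAGESIZE

# Anonymous mapping in this process's address space.
# ASSUMPTION: it stands in for a device buffer; no real hardware is involved.
buf = mmap.mmap(-1, EXTENT)

def device_write(offset, values):
    """Hypothetical stand-in for a peripheral writing results via DMA."""
    struct.pack_into(f"<{len(values)}I", buf, offset, *values)

def driver_read(offset, count):
    """The userspace 'driver' reads directly from the mapped region."""
    return list(struct.unpack_from(f"<{count}I", buf, offset))

# The "device" deposits four 32-bit words; the same process reads them
# and does the math on them without any intermediate copy step.
device_write(0, [1, 2, 3, 4])
result = driver_read(0, 4)
total = sum(result)
```

The point of the sketch is only that the data lands in memory the process already has mapped, so the computation can run in the same process that talks to the device.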
No, you're doing MPI operations on the switch fabric and the IB ASIC itself. The CPU doesn't touch these operations; it only sees the result. NVIDIA's DPU is just a more general-purpose version of this.
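A toy model may help show the difference from host-side reduction. This is not the MPI or InfiniBand API: `ToySwitch`, `submit`, and `allreduce_sum` are hypothetical names invented for this sketch, and the "switch" is just a Python object. It assumes a sum reduction over equal-length vectors, with each host handing in its data and receiving only the final result.

```python
class ToySwitch:
    """Hypothetical stand-in for a fabric element that reduces data in flight."""

    def __init__(self, n_ports):
        self.n_ports = n_ports
        self.pending = {}  # port -> vector contributed by that host

    def submit(self, port, vector):
        # A host only hands its data to the fabric; it does no arithmetic.
        self.pending[port] = list(vector)

    def allreduce_sum(self):
        # The reduction happens "in the network", not on any host CPU.
        if len(self.pending) != self.n_ports:
            raise RuntimeError("not all ports have contributed")
        reduced = [sum(column) for column in zip(*self.pending.values())]
        self.pending.clear()
        # Every host receives the same finished result.
        return {port: list(reduced) for port in range(self.n_ports)}


host_data = [[1, 2], [3, 4], [5, 6]]
switch = ToySwitch(n_ports=len(host_data))
for port, vector in enumerate(host_data):
    switch.submit(port, vector)
results = switch.allreduce_sum()
```

In this model the hosts never see each other's partial values, which mirrors the claim above that the CPU only sees the result of the operation.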