Extra hops on the PCIe link have even less of an impact on bandwidth than on latency.

This product is exposing the SSD directly to the host CPU because that's the only way to make the SSD useful. There's approximately zero software infrastructure for directly accessing an SSD from a GPU; nobody's running an NVMe driver and filesystem code entirely on the GPU, and even if you did, that would be a non-starter in the consumer space because it would effectively require reserving the entire SSD for use by a single application.

It is possible on Linux to have GPU code issue storage requests over io_uring, but only if the kernel is polling that queue; the GPU cannot directly issue a syscall to make the kernel start checking for new IO requests. And that request is still handled on the CPU as it passes through the OS filesystem/storage stack before the NVMe SSD is instructed to DMA the requested data directly to/from the GPU's VRAM.
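
To make the polling point concrete, here's a toy model of a submission ring, not real io_uring code. The class and method names (SubmissionRing, submit, poll) are made up for illustration. The producer stands in for the GPU and only ever writes to shared memory; the consumer stands in for a kernel thread polling the queue, which is roughly what io_uring's IORING_SETUP_SQPOLL mode provides. Nothing gets handled until the consumer looks.

```python
class SubmissionRing:
    """Toy single-producer ring (hypothetical, for illustration only)."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size
        self.head = 0  # advanced only by the consumer (the polling "kernel" side)
        self.tail = 0  # advanced only by the producer (the "GPU" side)

    def submit(self, request):
        """Producer: write the entry, then publish it by bumping tail.

        Note there is no call into the consumer here, i.e. no "syscall".
        """
        if self.tail - self.head == self.size:
            return False  # ring is full
        self.slots[self.tail % self.size] = request
        self.tail += 1
        return True

    def poll(self):
        """Consumer: drain every entry published since the last poll."""
        drained = []
        while self.head != self.tail:
            drained.append(self.slots[self.head % self.size])
            self.head += 1
        return drained


ring = SubmissionRing()
ring.submit(("read", 0, 4096))     # (op, offset, length), an assumed request shape
ring.submit(("read", 4096, 4096))

# The requests just sit in memory until something polls the ring.
pending_before_poll = ring.tail - ring.head
handled = ring.poll()
```

In the real thing the poller is a kernel thread and each drained entry still goes through the filesystem and block layers on the CPU; only the final data transfer is DMA between the SSD and VRAM.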

Microsoft's DirectStorage is (among other things) their effort to enable similar functionality for at least some use cases.