by bullen 1400 days ago
Is anything like this for networking done or in the works?
3 comments

For networking the closest equivalent would be TUN/TAP, which lets userspace route either IP packets (TUN) or Ethernet frames (TAP).
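To make the TUN/TAP idea concrete, here is a minimal sketch of how a userspace program might attach to a TUN or TAP device on Linux. It assumes `/dev/net/tun` exists and that the process has root or `CAP_NET_ADMIN`. The constants are taken from `<linux/if_tun.h>`. The helper names `build_ifreq` and `open_tun` and the interface name `tun0` are made up for illustration and do not come from the thread.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h>.
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001    # device carries IP packets
IFF_TAP = 0x0002    # device carries Ethernet frames
IFF_NO_PI = 0x1000  # do not prepend the 4-byte packet-info header


def build_ifreq(name: str, tap: bool = False) -> bytes:
    """Pack the struct ifreq passed to TUNSETIFF (hypothetical helper)."""
    raw = name.encode()
    if len(raw) > 15:  # IFNAMSIZ is 16, including the trailing NUL
        raise ValueError("interface name too long")
    flags = (IFF_TAP if tap else IFF_TUN) | IFF_NO_PI
    # 16-byte name, 16-bit flags, then padding up to a 40-byte struct.
    return struct.pack("16sH22x", raw, flags)


def open_tun(name: str = "tun0", tap: bool = False) -> int:
    """Return a file descriptor bound to the TUN/TAP interface."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, tap))
    except OSError:
        os.close(fd)
        raise
    return fd
```

Once the interface is brought up and given an address, each `os.read(fd, 2048)` on the returned descriptor should yield one IP packet (TUN) or one Ethernet frame (TAP), and each `os.write` injects one. The packets still pass through the kernel, so this is userspace routing rather than kernel bypass.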
If I'm understanding ublk and your question correctly, then yes, there are a lot of kernel-bypass networking options out there, such as OpenOnload, DPDK, and Mellanox (though Mellanox seems to have been absorbed into NVIDIA). You'll likely need a particular network card, an external kernel module, and at least an LD_PRELOAD to use them, though.
Is there no way to avoid kernel copying all network data?

I understand the frustration of having the network driver crash but could it not be run in a way that it doesn't bring down the OS?

It seems to me Java would have a no-brainer advantage of a user-space networking option since you're already in a VM!?

When I saturate my HTTP server the kernel takes 30% of the CPU just copying data for no good reason?!

Yes. Also, just to note, zero-copy and kernel-bypass are independent. Traditional Berkeley socket syscalls are copy+kernel; io_uring has (or will have) zerocopy+kernel; OpenOnload provides APIs for both copy+kernelbypass and zerocopy+kernelbypass.
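As a rough illustration of the "copy+kernel" versus "fewer copies, still kernel" distinction, here is a small sketch comparing a classic read/send loop with `sendfile`. Note that `sendfile` is not one of the mechanisms named in this thread; it is swapped in here because it is reachable from the Python standard library. It only skips the round trip through a userspace buffer, and how much copying is really avoided depends on the kernel and the socket type. The function names and the socketpair demo are assumptions for illustration.

```python
import os
import socket
import tempfile


def send_with_copy(sock: socket.socket, path: str, chunk: int = 65536) -> int:
    """Classic path: read() copies kernel -> user, send() copies user -> kernel."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            sock.sendall(data)
            sent += len(data)
    return sent


def send_without_user_copy(sock: socket.socket, path: str) -> int:
    """Ask the kernel to move file data to the socket itself.

    socket.sendfile() uses os.sendfile() where available and quietly
    falls back to plain send() otherwise.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    payload = b"hello from the page cache\n" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    a, b = socket.socketpair()
    try:
        n1 = send_with_copy(a, path)
        n2 = send_without_user_copy(a, path)
        a.close()  # signal EOF so the receive loop below terminates
        received = b""
        while True:
            chunk = b.recv(65536)
            if not chunk:
                break
            received += chunk
    finally:
        a.close()
        b.close()
        os.unlink(path)
    return n1, n2, received, payload
```

Both functions still go through the kernel network stack on every call, which is the point of the comment above: cutting copies and bypassing the kernel are separate axes.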
There is XDP.
XDP is kind of the opposite of this, right? It's moving userland code into the kernel.
XDP is a lot of stuff, but I think I have someone around using AF_XDP to bypass the kernel network stack and deliver some packets directly into userland buffer queues (the filtering, and the decision of which streams, is done through some eBPF, iirc). DPDK also has an AF_XDP backend to bridge your classical DPDK app and AF_XDP sockets.
Ah, that's true. AF_XDP is definitely similar to userland block device offload.
Yes, I think there is example code where io_uring is used to get blocks into and out of XDP/kernel.