by mbjorling 2618 days ago
It is worth mentioning that the Linux kernel has a new kernel API (io_uring) that changes the whole argument around using libos designs. With the new io_uring interface (available since Linux kernel 5.1), peak throughput per core is 1.7M IOPS, which beats or comes close to SPDK performance [0]. Later updates to the patches improve the throughput even more.

Jens (the author) has done a great writeup [1].

[0] https://lore.kernel.org/linux-block/20190116175003.17880-1-a...
[1] http://kernel.dk/io_uring.pdf

3 comments

Jens' benchmark for SPDK quoted there is far off from the numbers we (the SPDK community) measure. We are able to replicate his io_uring numbers though, so we agree that the new interface is a large improvement. We're working to make full benchmarking data available shortly.
Could you please elaborate more on io_uring vs libos? I would like to understand more but I don't really know how they compare...
Can you elaborate on how io_uring bypasses the kernel?
io_uring dumps data directly into a ring buffer mapped into the user-level address space. User code is notified by (at least) an updated atomic counter. The user process must be finished with the data before the kernel comes around again to overwrite it. Often that demands that the user process or thread be bound to a core which the OS has been forbidden to run anything else on, and that the thread do a carefully circumscribed amount of work, rarely including memory allocation, I/O, or even system calls, any of which may cause it to be "lapped" by subsequent writes.
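
To make the "lapped" failure mode concrete, here is a toy single-threaded model of the ring described above. It is only a sketch: `OverwritingRing`, `Consumer`, and `poll` are hypothetical names, a plain integer stands in for the atomic counter, and none of this is io_uring's actual API or memory layout.

```python
class OverwritingRing:
    """Fixed-size ring whose producer never waits for the consumer."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.head = 0  # total items ever written; stands in for the atomic counter

    def produce(self, item):
        # A write always succeeds; the oldest slot is silently overwritten.
        self.slots[self.head % self.size] = item
        self.head += 1


class Consumer:
    def __init__(self, ring):
        self.ring = ring
        self.tail = 0  # total items read or skipped so far
        self.lost = 0  # items overwritten before they could be read

    def poll(self):
        """Return every item still in the ring, counting any that were lapped."""
        head = self.ring.head
        oldest = max(0, head - self.ring.size)
        if self.tail < oldest:
            # The producer has gone all the way around past our position.
            self.lost += oldest - self.tail
            self.tail = oldest
        items = []
        while self.tail < head:
            items.append(self.ring.slots[self.tail % self.ring.size])
            self.tail += 1
        return items


ring = OverwritingRing(4)
consumer = Consumer(ring)

for i in range(3):
    ring.produce(i)
first = consumer.poll()    # consumer keeps up: nothing lost

for i in range(3, 10):     # 7 writes into a 4-slot ring with no poll in between
    ring.produce(i)
second = consumer.poll()   # only the newest 4 items survive

print(first, second, consumer.lost)
```

A real implementation would need atomic loads and stores with the right memory ordering, and would have to cope with a slot being overwritten while it is being read. The model leaves both out to keep the counter arithmetic visible.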

The idea is that the average time to process a packet absolutely must not exceed the average time between arrivals, and spikes in arrival rate must average out over the size of the ring buffer to less than the processing rate.
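
The condition above can be checked with a small back-of-the-envelope simulation. This is a sketch under made-up numbers: `simulate`, the per-tick arrival list, and the ring sizes are all hypothetical, and time is reduced to discrete ticks.

```python
def simulate(arrivals_per_tick, processed_per_tick, ring_size):
    """Return (max_backlog, dropped) for a consumer draining a bounded ring."""
    backlog = 0
    max_backlog = 0
    dropped = 0
    for arrived in arrivals_per_tick:
        backlog += arrived
        if backlog > ring_size:
            # More unread items than slots: the excess is lost.
            dropped += backlog - ring_size
            backlog = ring_size
        max_backlog = max(max_backlog, backlog)
        # The consumer drains at a fixed rate each tick.
        backlog = max(0, backlog - processed_per_tick)
    return max_backlog, dropped


# Average arrival rate (18 items over 7 ticks) is well under the
# processing rate of 4 per tick, but one tick carries a burst of 10.
arrivals = [2, 2, 10, 0, 0, 2, 2]

small = simulate(arrivals, processed_per_tick=4, ring_size=8)
large = simulate(arrivals, processed_per_tick=4, ring_size=16)
print(small)  # (8, 2): the burst overflows the 8-slot ring
print(large)  # (10, 0): a 16-slot ring absorbs the same burst
```

Both runs have the same average load. Only the ring size decides whether the burst is absorbed or data is lost, which is why the ring has to be sized for the spikes and not for the mean.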

The hamster process pulling from the ring may just be load balancing to a herd of other threads operating under less stringent conditions, so they might be permitted i/o.
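
The fan-out arrangement described above might look roughly like this. It is a sketch only: `run_dispatcher` and `handle` are hypothetical names, a plain iterable stands in for the ring, and Python threads are used for illustration where a real system would use pinned native threads.

```python
import queue
import threading


def run_dispatcher(items, num_workers, handle):
    """Fan items out round-robin to worker threads and collect the results."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()

    def worker(inbox):
        # Workers are free to block, allocate, or do I/O inside handle().
        while True:
            item = inbox.get()
            if item is None:  # sentinel: no more work
                break
            results.put(handle(item))

    threads = [threading.Thread(target=worker, args=(q,)) for q in inboxes]
    for t in threads:
        t.start()

    # The dispatcher does the bare minimum per item: pick a worker, hand off.
    for i, item in enumerate(items):
        inboxes[i % num_workers].put(item)

    for q in inboxes:
        q.put(None)
    for t in threads:
        t.join()

    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)  # completion order is not deterministic, so sort


squares = run_dispatcher(range(10), num_workers=3, handle=lambda x: x * x)
print(squares)
```

The unbounded queues here simply move the backlog out of the ring and into memory. A production design would bound them and decide what to do when a worker falls behind.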

tks for the great explanation!