by iyzhang 2586 days ago
High-level means not exposing hardware limitations to the application. The primary target applications are datacenter services, which spend much of their time processing network I/O. As network latencies drop to a few microseconds, datacenter applications like Redis will need kernel bypass because the kernel will become too expensive for them. In our experiments on a 25Gb network, the Linux kernel and POSIX interface account for 60% of Redis's latency.
3 comments

Network I/O is a major bottleneck for Redis, but they leave a lot on the table by being single-threaded. I can speak from experience because I maintain a multithreaded fork: https://github.com/JohnSully/KeyDB

KeyDB can easily get 2-3x the QPS with half the latency.

This is IMHO a wrong analysis. Redis can be scaled while staying single-threaded by running multiple processes: then, if you remove the overhead of the network stack, each process can deliver more QPS, not just better latency. By using threads (which Redis now also does in parts, getting 2x performance by making just 0.01% of the code threaded, that is, a single function) you continue to incur the I/O penalty, just amortized across more threads, but it continues to be a waste. Also, the latency reduction you measure with threads is an illusion: it shows up only in benchmarks, because the instance is more saturated when running on a single thread. If you measure single-request latencies, they are dominated by the network stack latency.
The lower latency is not an illusion; it is indeed lower latency for servers with high load. If you don't have high load then I agree the need for threads is eliminated, but people using Redis for real work have traffic where this becomes an issue. Multiple processes require clustering or sharding, each with its own set of overheads (both in CPU and human terms).

You and I disagree vehemently on this (hence the fork), but I really think you're optimizing for your own simplicity, not the user's. It should be the opposite, since the developer has the most insight into the software.

I don't think you understood my comment. What I mean is that, regardless of what you think of Redis and threads, the fact remains that doing I/O is wasteful and adds latency and CPU time; that cost is a constant.
How do those processes communicate?
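With plain sharding, one common arrangement is that the processes do not talk to each other at all: the client hashes each key and sends the request to whichever process owns it. A minimal sketch of that routing, assuming the Redis Cluster slot convention (CRC16 of the key modulo 16384); the port list and the helper names are hypothetical, and hash tags, resharding and failover are left out:

```python
# Sketch: client-side routing of keys to independent single-threaded
# server processes. PORTS, slot_for and port_for are illustrative names,
# not part of any Redis client library.

NUM_SLOTS = 16384                 # slot count used by Redis Cluster
PORTS = [7000, 7001, 7002, 7003]  # hypothetical: one process per port


def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM: polynomial 0x1021, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def slot_for(key: str) -> int:
    """Map a key to a hash slot (hash tags like {user} are ignored here)."""
    return crc16_xmodem(key.encode()) % NUM_SLOTS


def port_for(key: str) -> int:
    """Give each process a contiguous, equal-sized range of slots."""
    return PORTS[slot_for(key) * len(PORTS) // NUM_SLOTS]
```

A client would call `port_for("user:1")` and connect to that port. The same key always lands on the same process, so no cross-process coordination is needed on the request path, but multi-key operations spanning shards are the kind of overhead mentioned above.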
> High-level means not exposing hardware limitations to the application.

This seems counter-intuitive.

Hardware limitations imply a different abstraction than the one OS-level APIs present to applications.

Even POSIX does not expose hardware limitations.

Rather, high-level in the paper seems to mean an interface suitable for a wide range of applications, i.e., high-level in that it is meant to be used directly by applications as a portable interface.

I consider POSIX to be high-level. The RDMA and DPDK interfaces are not.
So what's the API look like?

You said you don't want to make users deal with flow control and hardware details. Does that imply a userspace bypass library that does that stuff for us? Does it look posixy?

Solarflare's OpenOnload and Mellanox's VMA both show up as LD_PRELOAD libraries that override traditional socket calls, unless you want to code your apps to their API directly.
It looks POSIX-like but uses high-level queues and fixes some issues with epoll. The lack of an atomic data unit and the overhead of the poor epoll interface cost too much to retain for kernel-bypass. Take a look at the paper for more details.
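To make the "atomic data unit" point concrete: a POSIX stream socket delivers bytes, not messages, so a single `recv` may return part of a request or several requests glued together, and the application has to add its own framing. The sketch below is not the paper's API; it only shows the length-prefix framing an application typically layers on a byte stream. The names `send_msg`, `recv_msg` and `demo` are made up for illustration.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message as <length><payload> so the receiver can find its end."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Loop until exactly n bytes arrive; recv may return fewer than asked."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock: socket.socket) -> bytes:
    """Read the length prefix, then the payload it announces."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo() -> list:
    """Send two messages back to back and recover them as separate units."""
    a, b = socket.socketpair()
    try:
        send_msg(a, b"SET k v")
        send_msg(a, b"GET k")
        return [recv_msg(b), recv_msg(b)]
    finally:
        a.close()
        b.close()
```

An interface whose unit of transfer is a whole queue element would make `recv_exact` and the length header unnecessary, which is presumably the kind of cost the comment above refers to.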
Where’s the paper? After looking at your site, it’s not obvious to me what paper to look at.