by dijit 1344 days ago
> We're talking about network stacks and network drivers, not web browsers.

Ah yes, the magic web-browser that doesn't do any kind of networking at all.

> Migrating the network stack from the kernel to a user-land process is not going to measurably slow down web browsers, especially on modern systems with IOMMUs and whatnots.

I don't know how you can possibly assert that; it contradicts computer science's current understanding of operating system design as it relates to kernel-mode/user-mode switching, unless you're doing weird shared-memory things in userspace... which is terrifying.

> That would require rewriting the network stack and network drivers in Rust

Not really. C and Rust interoperate just fine: you can have network drivers written in Rust while the actual networking stack itself remains in C, if you want.

> but this is not just about memory safety, Rust code can still be vulnerable in many other ways

The post is literally memory safety bugs.


> Ah yes, the magic web-browser that doesn't do any kind of networking at all.

The web browser isn't Netflix trying to serve hundreds of gigabits per second of encrypted video streams from a single server. Do you really need the ability to reliably saturate a 40 Gb/s Ethernet link to browse Hacker News comfortably? You'll hit various other bottlenecks long before a user-land network stack significantly affects performance for practical uses of a web browser.

As I've said, there are use-cases where extreme throughput and latency requirements warrant a design focusing on performance. Smartphones aren't one of them.

> I don't know how you can possibly assert that; it contradicts computer science's current understanding of operating system design as it relates to kernel-mode/user-mode switching, unless you're doing weird shared-memory things in userspace... which is terrifying.

Again, not everyone is Netflix. I'd rather have a computer capped at 1 Gb/s speed with a user-land network stack than a computer capable of saturating a 40 Gb/s Ethernet link with a kernel network stack when I'm managing my bank accounts. Most end-users don't need ludicrously fast network speeds to browse funny cat GIFs on their web browsers.

Also, I've contributed code to multiple operating systems (MINIX3, SerenityOS). Running a user-land network stack isn't going to turn your 1 Gb/s Ethernet card into a 10 Mb/s Ethernet card.

> Not really. C and Rust interoperate just fine: you can have network drivers written in Rust while the actual networking stack itself remains in C, if you want.

As far as I can tell, the bug is in the network stack itself. A network driver written in Rust wouldn't immunize your Linux kernel here from this bug.

> The post is literally memory safety bugs.

The consequence is about computer security, of which memory safety bugs are but one cause among many.

> The web browser isn't Netflix trying to serve hundreds of gigabits per second of encrypted video streams from a single server.

Ironically, server workloads are the ones that are increasingly moving to networking stacks that run in user space, using frameworks like DPDK, with performance as a motivator: https://en.wikipedia.org/wiki/Data_Plane_Development_Kit

Of course, there are some caveats - from my understanding, typical DPDK use cases would turn over the entire NIC to a single application, meaning you aren't contending with sharing the network between multiple, potentially adversarial user mode processes. This is fine for a server, but not really appropriate for a PC or smartphone.

Yes, the way Netflix and Co. are using userspace drivers is by passing entire devices to a single application.

There's no general purpose IPC happening there.

Netflix interestingly (rather than focusing on DPDK/user-space techniques) seems focused on increasing the throughput of kTLS on their CDN appliance boxes so they can simply sendfile(2) right out of VFS cache in kernel space for the bulk of the data plane. An alternative pathway to the same goal of increasing throughput by colocating your general data and your network stack state in the same context.
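
For anyone who hasn't used it, here is a minimal sketch of the sendfile part of that idea, in Python. It only shows a file being handed to a socket without the payload passing through a user-space buffer; it sets up no TLS and no kTLS, and the helper names `send_file_zero_copy` and `demo` are made up for illustration. It assumes a platform where `socket.socketpair()` is available.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over `sock` and return the byte count.

    socket.sendfile() uses os.sendfile() where the platform supports it, so the
    data can go from the page cache to the socket without being read into a
    user-space buffer first. Elsewhere it falls back to plain send().
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo() -> bytes:
    """Round-trip a small payload through a local socket pair."""
    payload = b"video-chunk" * 100
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        a, b = socket.socketpair()
        with a, b:
            sent = send_file_zero_copy(a, path)
            a.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
            received = b""
            while len(received) < sent:
                chunk = b.recv(4096)
                if not chunk:
                    break
                received += chunk
        return received
    finally:
        os.unlink(path)
```

Calling `demo()` should hand back the same bytes that were written to the temporary file. A real CDN box would be doing this against TCP sockets with the encryption handled in the kernel, which this sketch does not attempt.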

I wonder if io_uring will be able to remain competitive against DPDK-like approaches. Multi-tenant solutions are more attractive, and they seem like they could be extremely competitive, since they should be largely equivalent in the case where you have a single tenant.

> unless you're doing weird shared-memory things in userspace

Shared-memory things in userspace, i.e. buffers shared between 2 distinct user processes are no weirder than buffers that are shared between a user process and a kernel-mode driver. In both cases the buffers cannot be accessed by third parties.

Moreover, the transfer of data between 2 processes through a shared buffer can be done without any context switch (which could be slow), if the 2 processes are executed on distinct cores. Therefore having the network device driver as a distinct process does not have to cause any reduction in performance, if the means for inter-process communication are chosen wisely.
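
A rough sketch of what such a shared buffer could look like, as a single-producer/single-consumer ring in Python. Everything here is an assumption made for illustration: the slot layout, the `Ring` class, and the sizes are hypothetical, and the same segment is attached twice inside one process to stand in for two processes. A real implementation would run the two sides on separate cores and would need atomic operations and memory barriers on the indices, which Python does not expose.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<II")  # head (next slot to read), tail (next slot to write)
SLOT_SIZE = 64                 # hypothetical slot: 2-byte length prefix + payload
SLOTS = 8


class Ring:
    """Single-producer/single-consumer ring laid out in a shared buffer."""

    def __init__(self, buf):
        self.buf = buf

    def push(self, payload: bytes) -> bool:
        """Producer side: returns False when the ring is full."""
        if len(payload) > SLOT_SIZE - 2:
            raise ValueError("payload too large for one slot")
        head, tail = HEADER.unpack_from(self.buf, 0)
        if (tail + 1) % SLOTS == head:  # full: one slot is always kept empty
            return False
        off = HEADER.size + tail * SLOT_SIZE
        struct.pack_into("<H", self.buf, off, len(payload))
        self.buf[off + 2:off + 2 + len(payload)] = payload
        # Publish the slot only after its contents are written.
        struct.pack_into("<I", self.buf, 4, (tail + 1) % SLOTS)
        return True

    def pop(self):
        """Consumer side: returns None when the ring is empty."""
        head, tail = HEADER.unpack_from(self.buf, 0)
        if head == tail:
            return None
        off = HEADER.size + head * SLOT_SIZE
        (length,) = struct.unpack_from("<H", self.buf, off)
        payload = bytes(self.buf[off + 2:off + 2 + length])
        struct.pack_into("<I", self.buf, 0, (head + 1) % SLOTS)
        return payload


def demo():
    """Push two packets through one mapping and pop them through another."""
    size = HEADER.size + SLOTS * SLOT_SIZE
    driver_side = shared_memory.SharedMemory(create=True, size=size)
    app_side = shared_memory.SharedMemory(name=driver_side.name)
    try:
        HEADER.pack_into(driver_side.buf, 0, 0, 0)  # empty ring
        producer = Ring(driver_side.buf)
        consumer = Ring(app_side.buf)
        producer.push(b"pkt-1")
        producer.push(b"pkt-2")
        return [consumer.pop(), consumer.pop(), consumer.pop()]
    finally:
        app_side.close()
        driver_side.close()
        driver_side.unlink()
```

Neither `push` nor `pop` makes a system call; each side only reads and writes the shared mapping, which is the property the argument above relies on. Only the two processes that map the segment can see its contents.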

For any device driver that is implemented as a user process, the kernel can enable direct access to any I/O ports and memory-mapped I/O areas that are needed by the device, so the device driver can work in user mode without requiring any context switches.

Such direct I/O access cannot be enabled for ordinary processes, because those are not trusted enough and also because the direct I/O access could be enabled only for a single process at a time.

A dedicated device driver process solves both the trust problem and the multiple-access problem just as well as a kernel-mode driver does.

Things are more complicated. You can indeed have a very fast network driver in userspace (in fact for many use cases userspace networking is faster than the kernel). But where do you put the rest of the network stack?

> Ah yes, the magic web-browser that doesn't do any kind of networking at all.

They clearly didn't claim that. Your web browser being slow nowadays is not because it needs to do some networking.

They are claiming a loss in performance is ok.

I am claiming that people keep making this claim and it no longer holds true because software is already losing too much performance for the value we get back.

That's my whole thesis.

Your claim assumes that a small loss in performance in networking will lead to a loss in performance of the overall web browser, which is only true if networking is the bottleneck while browsing. And it usually isn't.

Ah, so you think the only thing I do with a computer is use the browser? That's weird; I was just giving an example of something that is already so slow that it is literally unworkable in the modern day.

Impacting networking affects the entire machine, especially insofar as a computer is increasingly just a dumb terminal to something else.

Look, if you make network requests potentially 20% slower then the browser performance will be impacted too. It's so obvious that I'm not sure how I can explain it more simply.

By how much? I am not sure, but you can't say it won't be slower at all unless we're talking about magic.

Pretending that it's trivial amounts of performance drop without evidence is the wrong approach. Show me how you can have similar performance with a 20% increase in latency and I will change my stance here.

As it stands there are two things I know to be true:

Browsers rely on networking (as do many things, btw) and software is increasingly slow to provide similar value these days.

The point is that most users and use-cases of networking don't have high requirements on bandwidth or latency that warrant a network stack design focused on high performance. Let the ones who want to live on the edge do so if they want, but don't force your high performance, one-bug-away-from-total-disaster network stack design based on your own (probably overblown) requirements on everyone else.

Grandma doesn't care if her tablet can't saturate a WiFi 6 link. Grandma doesn't care if her bank's web page takes an extra 75µs to traverse the user-land network stack. But she will care a whole lot if her savings are emptied while managing her bank account through her tablet. Even worse if her only fault was having her tablet powered on when the smart toaster of a neighbor compromised it because of a remotely exploitable vulnerability in her tablet's WiFi stack.

Or are you suggesting that grandma should've known better than to let her tablet outside of a Faraday cage?

> Pretending that it's trivial amounts of performance drop without evidence is the wrong approach.

Amdahl's law begs to differ. If it takes 5s for the web site to arrive from the bank's server, spending 5µs or 500µs in the network stack is completely irrelevant to grandma. Upgrading her cable internet to fiber to cut these 5s down to 500ms will have a much more positive impact on her user experience than optimizing the crap out of her tablet's network stack from 5µs down to 1µs.
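
To put those figures through the formula, here is a small sketch in Python. The function names are mine; the numbers are the ones used above (5s on the wire, 5µs or 500µs in the stack), and the simple additive model (total time = wire time + stack time) is an assumption.

```python
def page_load_seconds(network_wait_s: float, stack_time_s: float) -> float:
    """Total time seen by the user: time on the wire plus time in the local stack."""
    return network_wait_s + stack_time_s


def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the total time is made `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Slow user-land stack versus fast kernel stack, same 5 s wait on the server.
slow_stack = page_load_seconds(5.0, 500e-6)
fast_stack = page_load_seconds(5.0, 5e-6)
relative_penalty = (slow_stack - fast_stack) / fast_stack

# Optimizing the stack from 5 µs to 1 µs (5x faster) on the 5 s page load.
stack_fraction = 5e-6 / fast_stack
stack_tuning_speedup = amdahl_speedup(stack_fraction, 5.0)

# Cutting the 5 s wait to 500 ms while leaving the stack alone.
fiber_speedup = fast_stack / page_load_seconds(0.5, 5e-6)
```

Under these assumptions the 500µs stack costs roughly 0.01% more wall-clock time than the 5µs one, tuning the stack from 5µs to 1µs gives a speedup that differs from 1 by less than one part in a million, and the fiber upgrade gives a speedup of nearly 10x.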

What an incredibly weak argument, I'm disappointed to read it.

We're not talking microseconds; we're talking about a fundamental problem in computer science that has been open for 30 years and is no closer to being solved.

We're talking about a class of bugs that is rather easily solved by other means, which do not impose an unknown performance penalty on one of the slowest-improving components of modern computers.

Grandma isn't losing anything due to this. Heartbleed this ain't. Spectre this ain't. And crucially, we have the tools to ensure this never happens again without throwing our hands up in the air and saying "WELL COMPUTER NO GO".

If you're actually scared, I invite you to run OpenBSD as I did. You will learn very quickly that performance is a virtue you can't live without: a few extra instructions here, a lack of caching on gettimeofday() there, and suddenly the real lag of using the machine is extremely frustrating.

And again, for the final time I will say this: we can fix this and make it never happen again without any loss in performance.

That you keep advocating a loss in performance tells me that you've spent a career making everyone's life worse for your own experience. I am not a fan of that mentality.

Or maybe I've worked in AAA game dev too long, where we don't get the luxury of throwing away performance on a whim.

You are assuming that copying buffers around won't consume any CPU? Maybe it's perfectly fine, maybe it's not. But it needs some experimentation before we can handwave it away.
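
A first experiment along those lines could be as small as the Python sketch below, which times one extra in-memory copy per simulated packet. The function names and the default iteration count are my own choices, and interpreter overhead inflates the numbers for small buffers, so treat the result as a rough upper bound on this one machine rather than a measurement of any real network stack.

```python
import time


def copy_cost_ns(packet_size: int, iterations: int = 100_000) -> float:
    """Average nanoseconds spent on one extra copy of a `packet_size`-byte buffer."""
    src = bytearray(packet_size)
    dst = bytearray(packet_size)
    start = time.perf_counter_ns()
    for _ in range(iterations):
        dst[:] = src  # one buffer-to-buffer copy per simulated packet
    elapsed = time.perf_counter_ns() - start
    return elapsed / iterations


def cpu_share_of_copies(packet_size: int, packets_per_second: float,
                        iterations: int = 100_000) -> float:
    """Fraction of one core spent on the extra copy at the given packet rate."""
    return copy_cost_ns(packet_size, iterations) * packets_per_second / 1e9
```

For example, `cpu_share_of_copies(1500, 80_000)` estimates the share of one core spent copying at roughly 1 Gb/s of 1500-byte packets. Whatever number comes out, it covers only the copy itself and says nothing about context switches or cache effects.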