by nos4A2 1381 days ago
The comparison does not seem to be apples to apples. The perf gain seems to come from bundling the fork and exec calls, allowing the kernel to do vfork-like optimizations. I think a fair comparison would be against a new sys call that does fork_exec vs io_uring_spawn, and not fork + exec vs io_uring_spawn.
3 comments

That's absolutely where much of the performance currently comes from, and I did in fact benchmark exactly that and it provides comparable performance (for now). I also noted in my talk that you can get much of the way there by doing clone with CLONE_VM and doing exec from there.

But also, the tricks to make it faster without using io_uring are much less safe than using io_uring. vfork is dangerous.

io_uring is the standard kernel mechanism for doing multiple operations in one syscall, so it seems like the most logical fit for this.

Jens Axboe (io_uring maintainer) and I have many more plans to make this faster.

Leaving that aside, there are many additional capabilities this unlocks. For instance, we can maintain a pool of processes set up and ready to exec.

> I think a fair comparison would be against a new sys call that does fork_exec vs io_uring_spawn, and not fork + exec vs io_uring_spawn.

I agree that's a useful comparison, although the status quo is also worth benchmarking. A fork_exec syscall sounds like a posix_spawn syscall. IIRC, glibc's current posix_spawn is a library function that does a vfork followed by setup operations that portable code isn't allowed to call after vfork.
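
To make the posix_spawn shape concrete, here is a minimal sketch in Python, whose os.posix_spawn wraps the C library function. The helper name spawn_capture is hypothetical. The point is only that the setup steps are handed over as a list of data, rather than being run as arbitrary code between (v)fork and exec.

```python
import os
import sys

def spawn_capture(argv):
    """Spawn argv via posix_spawn with the child's stdout sent to a pipe."""
    read_fd, write_fd = os.pipe()
    # The setup steps are described up front as data. The C library applies
    # them in the child between process creation and exec.
    file_actions = [
        (os.POSIX_SPAWN_DUP2, write_fd, 1),  # child's stdout -> pipe write end
        (os.POSIX_SPAWN_CLOSE, read_fd),     # child does not need the read end
    ]
    pid = os.posix_spawn(argv[0], argv, os.environ, file_actions=file_actions)
    os.close(write_fd)  # parent keeps only the read end
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_code, output
```

Something like spawn_capture([sys.executable, "-c", "print('hi')"]) should return the child's exit code and whatever it wrote to stdout. Note that argv[0] has to be a full path here, since posix_spawn (unlike posix_spawnp) does not search PATH.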

I think there are two dimensions in which io_uring can be used to amortize/eliminate syscalls:

* sequence of operations: in this case, (v)fork, then setup calls, then exec. This sequence is unusually expensive because doing them separately requires extra page table work (particularly for the fork case, but even for vfork vs having a blank process). A posix_spawn syscall could achieve efficiency gains similar to those of this io_uring_spawn.

* batches of operations: in this case, lots of processes to spawn. It sounds like he hasn't posted numbers on that and intends to: "He's looking forward to ... supporting a pre-spawned process pool". If that turns out well, then the io_uring approach seems worth doing, and also can be used to implement a new in-user-space posix_spawn, for less total syscall surface area than doing both io_uring and a synchronous posix_spawn syscall. Maybe the supported setup operations also can be in common with io_uring operations for other purposes. I'm curious about that but don't see a link to benchmark code / implementation code / manpage.
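
For reference, the status quo sequence from the first bullet looks roughly like this. It is a sketch in Python assuming a POSIX system, and fork_setup_exec is a hypothetical helper name, not anything from the talk. Every step (fork, each setup call, exec) is its own trip into the kernel, which is the part a single spawn operation would collapse.

```python
import os
import sys

def fork_setup_exec(argv, cwd):
    """Run argv in directory cwd with stdout captured, one syscall per step."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()                    # step 1: duplicate the process
    if pid == 0:
        # Child: each setup call below is a separate kernel entry.
        try:
            os.dup2(write_fd, 1)       # step 2: redirect stdout to the pipe
            os.chdir(cwd)              # step 3: change working directory
            os.execv(argv[0], argv)    # step 4: replace the process image
        finally:
            os._exit(127)              # reached only if a step above failed
    os.close(write_fd)                 # parent keeps only the read end
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_code, output
```

The batching dimension from the second bullet would then be about issuing many of these at once, which this per-process sketch does not attempt to show.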

I am a fan of io_uring-style interfaces and think fork() is completely busted, but I agree that intuitively the comparisons seem to obfuscate where a lot of the perf wins come from.

1. posix_spawn() performance seems terrible on Linux. There is just no excuse for it to be worse than vfork()... I know on Linux it is implemented in library code, but on macOS it is a syscall, which achieves most of the stated benefits of io_uring_spawn (batching up all the posix_spawn_actions and then handing them down to the kernel in a single syscall).

2. Again, I think the whole "hope posix_spawn supports the actions you need" is a bit overblown here. It is not like io_uring supports operations for every single syscall that exists either. People keep holding up the lack of some operation in posix_spawn() as a justification for designing a new interface. Just add some new flags and actions (see posix_spawn_file_actions_addchdir_np(), POSIX_SPAWN_SETEXEC, or POSIX_SPAWN_CLOEXEC_DEFAULT on macOS for examples).
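
As a small illustration of the flag/attribute style of extension, here is a sketch using Python's os.posix_spawn, which exposes some of the standard spawn attributes as keyword arguments. It uses the portable setpgroup attribute. The macOS-only flags mentioned above are not exposed by Python, and spawn_in_new_process_group is a hypothetical helper name.

```python
import os
import sys

def spawn_in_new_process_group(argv):
    """Spawn argv as the leader of its own process group; return its exit code."""
    # setpgroup=0 corresponds to POSIX_SPAWN_SETPGROUP with a pgroup of 0,
    # meaning the child's process group ID becomes its own PID.
    pid = os.posix_spawn(argv[0], argv, os.environ, setpgroup=0)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

# The child reports via its exit code whether it leads its own process group.
check = "import os, sys; sys.exit(0 if os.getpgrp() == os.getpid() else 1)"
```

Calling spawn_in_new_process_group([sys.executable, "-c", check]) lets the child confirm the attribute took effect. The setup is expressed as one more parameter on the spawn call, not as code run in the child.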

Obviously posix_spawn() is a synchronous operation, so in that respect io_uring_spawn is inherently a bit faster for processes that want to asynchronously launch processes, but if Linux implemented posix_spawn() like macOS did then you could achieve the same thing just by adding support for issuing a posix_spawn() operation via io_uring just like any other operation.

Given that io_uring has a lot of momentum and is getting broad support for various operations, it probably does make sense to implement io_uring_spawn and then layer posix_spawn() on top of it as proposed in the talk. But I wish the slides called out that a lot of the performance gains seem (at least to me) to have little to do with the nature of io_uring and more to do with the fact that posix_spawn() is implemented purely in userspace on Linux and can't batch the existing actions because of that.

It looked to me like providing a cleaner interface which still effectively supports everything vfork does, with all the performance gains (and fewer context switches) was in fact the point of the exercise.