Hacker News new | ask | show | jobs
by rockorager 331 days ago
Oh that's cool. `find` is another tool I thought could benefit from io_uring like `ls`. I think it's definitely worth enabling io_uring for single threaded applications for the batching benefit. The kernel will still spin up a thread pool to get the work done concurrently, but you don't have to manage that in your codebase.
2 comments

I did try it a while ago and it wasn't profitable, but that was before I added stat() support. Batching those is probably good
and grep / ripgrep. Or did ripgrep migrate to using io_uring already?
No, ripgrep doesn't use io_uring. Idk if it ever will.
Curious: Why? Is it not a good fit for what ripgrep does? Isn't the sort of "streaming" "line at a time" I/O that ripgrep does a good fit for async io?
For many workloads, ripgrep spends the vast majority of its time searching through files.

But more practically, it would be a terror to implement. ripgrep is built on top of platform specific standard file system APIs. io_uring would mean a whole heap of code to work with a different syscall pattern in addition to the existing code pattern for non-Linux targets.

So to even figure out whether it would be worth doing that, you would need to do a whole bunch of work just to test it. And because of my first point above, there is a hard limit on how much of an impact it could even theoretically have.

Where I would expect this to help is to batch syscalls during directory tree traversal. But I have nonidea how much it would help, if at all.

I believe that io_uring does not support getdents (despite multiple patch series being proposed). So you'd get async stat(), if you need them, but nothing else.
Oh. Well, that's definitely a deal breaker then. ripgrep already avoids stat calls on most files.