by CyberRabbi
1630 days ago
> As for putting things in threads, I would consider it a huge hack to move open/close. Threads are not and will never be mandatory to have great responsiveness.
The POSIX interface was invented for batch processing. Long running non-interactive jobs. This is why it lacks timing requirements. All well-designed interactive GUI applications do not interact with the file system on their main thread. This is especially true for game display loops. The fundamental problem here is that they are doing unbounded work on a thread that has specific timing requirements (usually 16.6ms per loop). As I’ve said elsewhere, this bug will still manifest itself no matter how fast you make close(), just depends on how many device files are present on that particular system. It’s a poor design. Well designed games account for every line of code run in their drawing loop.
> This is absolutely a kernel bug.
I don’t think that is proven unless the original author can chime in. It’s your best guess and opinion that the author intended to not block on synchronize_rcu but it’s perfectly possible they did indeed intend the code as written. synchronize_rcu is used in plenty of other critical system call paths in similar ways, not every one of those uses is a bug. I would guess you might be slightly suffering from tunnel vision a bit here given how the behavior was discovered.
If it is indeed the case the synchronize_rcu is taking up to 50ms I would suspect there is a deeper issue at play on this machine. By search/replacing the call with call_rcu or similar you may just be masking the problem. RCU updates should not be taking that long.
|
I strongly disagree. A well-designed interactive GUI application can absolutely interact with the filesystem on its main thread without any impact on responsiveness whatsoever. You only need threads once you need more CPU time.
The POSIX interfaces provide sufficient non-blocking functionality for this to be true, and the (as per the documentation, "brief") blocking allowed by things like open/close is not an issue.
(io_uring is still a nice improvement though.)
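To make the non-blocking point concrete, here is a minimal sketch of a single-threaded loop that services a descriptor with poll() and a zero timeout before doing its per-frame work. It is an illustration under assumptions, not code from the application discussed: the names `run_frames` and `FRAME_BUDGET_S` are hypothetical, and a pipe stands in for any pollable descriptor (regular files always report ready under poll(), so this does not cover them).

```python
import os
import select
import time

FRAME_BUDGET_S = 0.0166  # hypothetical 16.6 ms frame budget


def run_frames(read_fd, n_frames):
    """Single-threaded loop: check the descriptor without blocking, then 'render'."""
    os.set_blocking(read_fd, False)  # sets O_NONBLOCK on the descriptor
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    received = bytearray()
    worst_io_time = 0.0
    for _ in range(n_frames):
        start = time.monotonic()
        # Timeout 0: poll() returns immediately whether or not data is ready.
        for fd, events in poller.poll(0):
            if events & select.POLLIN:
                try:
                    received += os.read(fd, 4096)
                except BlockingIOError:
                    pass  # nothing to read after all; try again next frame
        worst_io_time = max(worst_io_time, time.monotonic() - start)
        # ... per-frame rendering work would go here ...
    return bytes(received), worst_io_time


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"hello")
    data, worst = run_frames(r, 3)
    print(data, worst < FRAME_BUDGET_S)
    os.close(r)
    os.close(w)
```

The loop never waits on I/O: each frame pays only for a poll() call and for whatever data is already available.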
> I don’t think that is proven unless the original author can chime in.
This argument is nonsense. Whether or not code is buggy does not depend on whether or not the author comments on the matter. This is especially true for a project as vast as the Linux kernel with its massive number of ever-changing authors.
> If it is indeed the case the synchronize_rcu is taking up to 50ms I would suspect there is a deeper issue at play on this machine. By search/replacing the call with call_rcu or similar you may just be masking the problem. RCU updates should not be taking that long.
synchronize_rcu is designed to block for a significant amount of time, but I did not push the patch further precisely because I would like to dig deeper into the issue rather than making a textbook RCU fix.