by JonChesterfield
554 days ago
The "proper syscall" isn't a fast thing either. The context switch blows out your caches. Part of why I like the name syscall is that it's an indication not to put it on the fast path. The implementation behind this puts a lot of emphasis on performance, though the protocol was heavily simplified in upstreaming. Running over PCIe instead of on the APU systems makes things rather laggy too. The design is roughly a mashup of io_uring and occam, made much more annoying by the GPU scheduler constraints. For what it's worth, the two authors of this thing probably count as people who program GPUs for a living.