by almostgotcaught 640 days ago
Yes, I literally duplicated your approach for my driver stack last week and, surprise surprise, the FFI overhead into libc is too high.

FFI? This isn't how GPUs work... they are MMIO (mostly).

Those drivers are faster than anything else when used to run fixed command queues (which is what neural network runs are).