by almostgotcaught 641 days ago
No, it won't, because while hitting ioctls in Python is cute

https://github.com/tinygrad/tinygrad/blob/master/extra/hip_g...

it is definitely not shippable.

2 comments

Because it's slow, duh.
This sounds like prejudice. Have you benchmarked it?
Yes, I literally duplicated your approach for my driver stack last week and, surprise surprise, the FFI overhead into libc is too high.
FFI? This isn't how GPUs work... they are (mostly) MMIO.

Those drivers are faster than anything else when used to run fixed command queues (which is what neural network runs are).

I can't say anything about the performance, but inline assembly in Python is crazy.
It's not inline assembly; it's just ioctl through ctypes via libc.
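
For anyone unsure what "ioctl through ctypes via libc" looks like in practice, here is a minimal sketch. It is not taken from the tinygrad file linked above. It assumes a Linux-like system and uses the harmless FIONREAD request on a pipe instead of a GPU device node. The helper names bytes_pending and demo are made up for illustration.

```python
import ctypes
import os
import termios

# Load the C library already linked into the Python process.
# use_errno=True lets us read errno after a failed call.
libc = ctypes.CDLL(None, use_errno=True)

# ioctl is variadic: declare only the two fixed parameters,
# the third argument is passed as an extra (variadic) argument.
libc.ioctl.argtypes = [ctypes.c_int, ctypes.c_ulong]
libc.ioctl.restype = ctypes.c_int


def bytes_pending(fd: int) -> int:
    """Ask the kernel how many bytes are readable on fd (FIONREAD)."""
    n = ctypes.c_int(0)
    # The kernel writes the answer into n through the pointer we pass.
    if libc.ioctl(fd, termios.FIONREAD, ctypes.byref(n)) < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return n.value


def demo() -> int:
    """Write five bytes into a pipe and query the read end via ioctl."""
    r, w = os.pipe()
    try:
        os.write(w, b"hello")
        return bytes_pending(r)
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(demo())
```

A real driver would open a device node and pass a ctypes.Structure describing the request instead of a plain int, but the call path (Python, then ctypes, then libc's ioctl, then the kernel) is the same one the thread is arguing about.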