by Maro 785 days ago
Interesting!

Would this approach (with non-linear learning) still be able to utilize GPUs to speed up training?

1 comment

Seconded. I’m guessing you could write an implementation that does that and then add optimised Triton/CUDA kernels to accelerate it, but I’d need to investigate further.