by throwaway314155 150 days ago
PyTorch MPS is about 10x faster per the README.md.
1 comment

I cut the speed gap in half by keeping the activations on the GPU. Time to sleep, but I'll continue tomorrow.
Have you tried something like Mojo, which can vectorize and use SIMD without requiring intrinsics everywhere?