Hacker News
by dragandj 2253 days ago
Thank you so much, that's phenomenal news for me, since I can make Neanderthal code run in 23 ms (GPU) and 3XX ms (CPU) when I implement it the way NumPy/CuPy/PyTorch does (sans float64 conversion, of course). You saved me from having to fiddle with Python (which I don't particularly enjoy). Thanks again!

Can you please post your implementation of this function here, so I can try it on my machine and compare it to Neanderthal?