by dragandj 2253 days ago
I still don't get what the fact that someone could implement in PyTorch the same thing I did in Clojure has to do with NumPy and CuPy, or with my benchmark.

BTW, your PyTorch code is still incorrect (or so it seems to me, although I don't use PyTorch, so I can't try it on the computer). The formula for correlation requires division by sigma_x * sigma_y (which has dimension n x n), and you are dividing by (sigma_x)^2 (which has dimension n). So you still forgot at least one L2 operation that computes all combinations of sigma_x_y. A couple of operations here, a couple of operations there, an edge case here, an edge case there, and it adds up. That's why people use NumPy/CuPy after all...
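To make the normalization concrete, here is a minimal NumPy sketch of the formula as I understand it. The function name `correlation_matrix`, the random input, and the rows-are-observations layout are assumptions for illustration only; this is not the code from the benchmark or from the PyTorch version under discussion.

```python
import numpy as np

def correlation_matrix(x):
    """Pearson correlation between the columns of x (rows are observations)."""
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / (x.shape[0] - 1)  # n x n covariance matrix
    sigma = np.sqrt(np.diag(cov))                   # length-n standard deviations
    # Entry (i, j) is divided by sigma_i * sigma_j, an n x n outer product,
    # not by sigma_i^2, which only has n entries.
    return cov / np.outer(sigma, sigma)

# Hypothetical input: 1000 observations of 5 variables.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 5))
corr = correlation_matrix(x)
```

The diagonal should come out as ones, and the result can be checked against `np.corrcoef(x, rowvar=False)`.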


- The function takes 29.4 ms in CuPy and 427 ms in NumPy. Happy?

- Broadcasting semantics + division take care of the outer-product normalization. That is 2 L1 ops, sized by the matrix and the input (~ xSCAL).
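A small NumPy sketch of the broadcasting point, assuming a covariance matrix `cov` built from made-up random data (the names and sizes are illustrative, not the benchmarked function): dividing by a column vector and then by a row vector gives the same result as dividing by the explicit outer product.

```python
import numpy as np

# Hypothetical data: covariance of 5 variables over 1000 observations.
rng = np.random.default_rng(0)
cov = np.cov(rng.standard_normal((1000, 5)), rowvar=False)
sigma = np.sqrt(np.diag(cov))  # length-n standard deviations

# Explicit n x n outer product, then one elementwise division.
via_outer = cov / np.outer(sigma, sigma)

# Broadcasting: scale rows by sigma_i, then columns by sigma_j,
# without materializing the outer product.
via_broadcast = cov / sigma[:, None] / sigma[None, :]
```

Both paths should agree up to floating-point rounding.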

Pedantry is still not an argument.

Thank you so much, that's phenomenal news for me! I can make Neanderthal code run in 23 ms (GPU) and 3XX ms (CPU) when I implement it the way NumPy/CuPy/PyTorch does (sans float64 conversion, of course). You saved me from having to fiddle with Python (which I don't particularly enjoy). Thanks again!

Can you please post your implementation of this function here, so I can try it on my machine and compare it to Neanderthal?