by akssri
2256 days ago
The PyT version above and both the CPU and GPU versions on Neanderthal work entirely in float32. I confirm that the performance of Neanderthal on the GPU is similar to that of the PyT version above. The flaw is that you're comparing this to NumPy (CPU) and CuPy (GPU), both of which coerce the input array to float64 (for precision reasons) before computing the covariance and correlation. You only need to check the dtype of the result to verify this (if the pointer to the code is not sufficient).
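A minimal sketch of that dtype check on the NumPy side, assuming NumPy is installed. The array shape and random seed are arbitrary choices for illustration and are not taken from the benchmark being discussed:

```python
import numpy as np

# Arbitrary illustrative data: 4 variables, 1000 observations each, in float32.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1000)).astype(np.float32)

# Called without an explicit dtype, np.corrcoef promotes to at least float64.
c = np.corrcoef(x)

print("input dtype: ", x.dtype)
print("result dtype:", c.dtype)
```

If the result dtype is float64 while the input is float32, the computation was not done in single precision, so timing it against a pure float32 implementation is not a like-for-like comparison.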
The reason for that is that CuPy is poorly implemented. And CuPy is poorly implemented because it is constrained by what NumPy does, which, in turn, does things that are OK on the CPU but often translate poorly to the GPU.