by rrss
2256 days ago
> I would love to improve any part of this article, if possible!

You should state that the reason the neanderthal version runs faster is that it is doing the computation in lower precision than numpy / cupy. The article walks through an investigation of why cupy's result is underwhelming (is the input data accidentally fp64? is the input data on the cpu? is the computation happening on the cpu?), so you should finish it by explaining that numpy and cupy do the computation in fp64.
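
To make the precision point concrete, here is a minimal sketch of how one might check which dtype numpy actually computes in. It assumes the "lower precision" in question is fp32, which the comment above does not state, and the helper name `matmul_dtypes` is made up for illustration. CuPy is left out because it needs a GPU to run.

```python
import numpy as np


def matmul_dtypes(n=4):
    """Return the dtypes of a default numpy array and of two matrix products.

    Hypothetical helper for illustration only; it is not from the article.
    """
    a = np.random.rand(n, n)       # numpy creates float64 (fp64) by default
    b = a.astype(np.float32)       # explicit downcast to fp32 (assumed target precision)
    return a.dtype, (a @ a).dtype, (b @ b).dtype


if __name__ == "__main__":
    default_dtype, fp64_product, fp32_product = matmul_dtypes()
    print("default array dtype:", default_dtype)
    print("product of default arrays:", fp64_product)
    print("product of float32 arrays:", fp32_product)
```

A fair benchmark would cast the inputs to the same dtype on every side before timing, so that the comparison measures the libraries rather than the precision.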