by erwincoumans
1534 days ago
I would recommend considering nanobind, the follow-up to pybind11 by the same author (Wenzel Jakob), and moving as much performance-critical code as possible to C or C++. https://github.com/wjakob/nanobind

If you really care about the performance of code called from Python, consider something like NVIDIA Warp (Preview). Warp jits and runs your code on CUDA or CPU. Although Warp targets physics simulation, geometry processing, and procedural animation, it can be used for other tasks as well. https://github.com/NVIDIA/warp

Google's JAX is another option, jitting and vectorizing code for TPU, GPU, or CPU. https://github.com/google/jax
Why would you recommend that? It's all way more effort than just writing Cython, especially in a Jupyter Notebook. And Cython code can be just as fast as C/C++ code unless you're doing something really fancy. It's a bunch of work for no benefit.
>Warp jits and runs your code on CUDA or CPU
If someone's writing Cython, it's probably because they found something that couldn't be done efficiently in NumPy because it was sequential and not easily vectorisable. Such code is going to get zero benefit from CUDA or from running on a GPU.
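To make "sequential, not easily vectorisable" concrete, here is a minimal sketch in plain Python. It is my own illustration, and the function names `clamped_running_sum` and `naive_vectorised` are hypothetical, not from any library. Each step depends on the clamped result of the previous step, so the obvious "cumulative sum, then clip" rewrite gives a different answer.

```python
from itertools import accumulate


def clamped_running_sum(values, lo, hi):
    """Running sum whose total is clamped to [lo, hi] after every step.

    Step i needs the clamped total from step i-1, so the loop is
    inherently sequential.
    """
    total = 0.0
    out = []
    for v in values:
        total = min(max(total + v, lo), hi)
        out.append(total)
    return out


def naive_vectorised(values, lo, hi):
    """What a 'cumsum then clip' rewrite computes (the NumPy-style shortcut).

    The clamp is applied only at the end, so earlier saturation is lost.
    """
    return [min(max(s, lo), hi) for s in accumulate(values)]


data = [5.0, 5.0, -3.0]
sequential = clamped_running_sum(data, 0.0, 8.0)
shortcut = naive_vectorised(data, 0.0, 8.0)
print(sequential, shortcut)
```

The two results disagree on the last element, because the sequential version saturates at 8 before subtracting 3. A loop like this is the kind of thing you would type in Cython rather than try to push through a vectorised or GPU path.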
In general, jitted code is not going to be as fast as code compiled ahead of time by a C compiler, which is what Cython uses. Moreover, using a JIT makes your code a pain in the ass to embed in a C/C++ application, unlike Cython code.