by geph2021 805 days ago
As far as I can tell, its optional dependency is OpenMP, not CUDA. It doesn't seem to depend on CUDA directly.
2 comments

The plan is to eventually implement with CUDA:

"Currently, I am working on [...] direct CUDA implementation, which will be significantly faster and probably come close to PyTorch."

Yes, a quick skim of the code only shows an OpenMP dependency. The C/CUDA reference may have been meant to read C/OMP.

Although I wonder whether it would work well with GCC's OpenMP offloading to PTX.
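
For anyone curious what that would look like, here is a rough sketch, assuming a made-up kernel (`saxpy` is hypothetical and not taken from the project's code). The idea is that the same loop is annotated with an OpenMP `target` directive, so a GCC built with nvptx offload support (something like `-fopenmp -foffload=nvptx-none`) can try to run it on the GPU, while a build without `-fopenmp` just ignores the pragma and runs it serially.

```c
#include <stddef.h>

/* Hypothetical kernel for illustration only: y[i] += a * x[i]. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    /* With an offload-capable compiler this loop may be executed on the
       device: x is copied to it, y is copied there and back.
       Without -fopenmp the pragma is ignored and the loop runs on the host. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Whether this performs well is a separate question: the `map` clauses copy data on every call, so small kernels can easily end up slower than the plain host loop.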