by jjoonathan 2623 days ago
I've done a fair bit of Numba CUDA and I was so happy to throw everything out and switch to C++.

NumbaCUDA gave me lots of small problems and a few big ones. The poor support for debug/perf tools and poor integration with other high-level python CUDA code (FFTs in particular) sent me packing, but the number of small problems was excessive in comparison to the size of my code. I had 5 reduced bugs at the bottom of my notebook and two paragraphs of "baggage" at the top to support a tiny little 50LoC kernel: one paragraph for the environment variables and one for patching numbacuda itself for a trivial API incompatibility that hadn't been fixed for the better part of a year. All of this for a tool that provided a diminutive subset of functionality at the intersection of both python and C. I've felt more computational freedom writing BASIC on my TI-83.

CuPy could well have changed that equation!

> incomprehensible error messages when the type inference goes wrong

NumbaCUDA is truly the galaxy-brain of type checking: first it complains loudly so as to force you to provide type information, then it opts to not complain about a mismatch, and then it silently reinterpret_casts a double* to float* behind your back.
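To make the double*-to-float* complaint concrete, here is a rough host-side sketch of what reading a buffer of doubles as floats does to the values. It is not Numba code and does not reproduce the bug itself; it only uses Python's standard `struct` module, assumes little-endian byte order, and the helper name `reinterpret_doubles_as_floats` is made up for illustration.

```python
import struct


def reinterpret_doubles_as_floats(values):
    """Pack values as little-endian float64, then read the same bytes as float32.

    Hypothetical helper: it mimics a reinterpret_cast from double* to float*
    on the host, with no conversion of the values themselves.
    """
    raw = struct.pack(f"<{len(values)}d", *values)
    # Each 8-byte double is read back as two unrelated 4-byte floats.
    return struct.unpack(f"<{len(raw) // 4}f", raw)


doubles = [1.0, 2.0]
print(reinterpret_doubles_as_floats(doubles))  # (0.0, 1.875, 0.0, 2.0)
```

Nothing crashes: you get twice as many elements, and most of them are plausible-looking garbage, which is why a silent mismatch like this is so unpleasant to track down.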

I know it's free software and I have no right to complain, but I sure sunk a lot of time into this dead end and regret it.

Spiffy icon though.

1 comments

What’s the difference between Numba CUDA and PyTorch or similar?

If you’re doing custom kernels you should take a look at the Julia library CuArrays [1] and generic kernels [2]. I really like that I don’t have to dig into C++ and deal with all of the memory and kernel management.

1: https://github.com/JuliaGPU/CuArrays.jl
2: http://mikeinnes.github.io/2017/08/24/cudanative.html

My impression was that pytorch focused on linear algebra / deep learning. The reason I was playing with numbacuda in the first place was because part of my problem did not fit nicely into a (dense) linear algebra framework, so numbacuda's custom kernel support seemed attractive. Does pytorch have a good low-level kernel library? Or sparse linear algebra library?

I love Julia, but I haven't managed to convert anyone else on my team and I already spent my informal exploration budget for the GPU project on numbacuda, so JuliaGPU will have to wait for another time. I'll be sure to keep it in mind, though!

How is the CUDA debug/perf story with Julia? Does it play nice with the nvidia tooling?

Ah, that makes sense. I've only dabbled a little with DNNs recently, but pytorch/tensorflow seemed very targeted toward deep learning. Generic tools seem more useful to me. What are you doing with FFTs?

I haven't dug deep enough into CUDAnative / CuArrays to understand the state of Julia perf debugging. Though here's one post on the topic:

https://discourse.julialang.org/t/cudanative-is-awesome/1786...

In general it's been very pleasant experimenting with GPU programming in Julia. I couldn't quite grok tensorflow code, and it's cool to just declare a Julia array and send it to the GPU.