by juvoly 32 days ago
Increasingly (see, for instance, the ADSP podcast [1]), those in NVIDIA's inner circle are advocating against writing your own CUDA kernels (unless that's your full-time job at NVIDIA, that is).

[1] https://adspthepodcast.com/2024/08/30/Episode-197.html

4 comments

That would be cool, but NVIDIA released Blackwell and still has not released unbroken kernels for sm120. Sm120 is not the data center GPU, so it doesn't get any love. So my point, unfortunately, is that we can't depend on NVIDIA to do the right thing.
It’s not about whether you work at Nvidia. Avoid writing CUDA kernels if there are higher level libraries that do what you need. Do write CUDA kernels if you want to learn how, or if you need the low level control, or to micro-optimize. Being able to fuse kernels to avoid memory traffic or get better specialization is also a reason to reach for raw CUDA. Just consider what’s the right tool for the job…
I don't think writing CUDA is a good way to do this tbh
To do what? If you need the highest-performance GPU kernels on NVIDIA hardware, using CUDA is the way to go.
Writing efficient CUDA code is very, very difficult; most CUDA code is not actually good at utilizing the hardware. It is much easier to write performant code in higher level languages (and most people are doing exactly this).
That all depends on what you’re doing. Like I said, if a high level lang or lib supports and fits your goal well, then yes you should use it. I don’t know what most people are doing, but it’s fair to say that a lot of people can use a higher level language.

If you’re trying to learn CUDA, then using a higher level language is not the best approach. If you already used a high level language and found that your performance is lacking and could be better if you could fuse some of your kernels, and avoid some of the memory round-trips, then moving to something lower level is called for.
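To make the "memory round-trips" point concrete, here is a minimal host-side sketch in plain C++, not actual CUDA. Each loop stands in for one kernel launch and each `std::vector` for a buffer in GPU global memory. The function names and the scale-then-add operation are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU analogy only: one loop == one hypothetical kernel launch,
// one vector == one buffer in global memory.

// Unfused: two passes. The intermediate `tmp` is written out in full by the
// first pass and read back in full by the second (5 element accesses per i:
// read x, write tmp, read tmp, read y, write out).
std::vector<float> scale_then_add_unfused(const std::vector<float>& x,
                                          const std::vector<float>& y,
                                          float a) {
    assert(x.size() == y.size());
    std::vector<float> tmp(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        tmp[i] = a * x[i];            // "kernel" 1: scale
    }
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out[i] = tmp[i] + y[i];       // "kernel" 2: add
    }
    return out;
}

// Fused: one pass. The intermediate lives in a local variable (a register on
// a GPU), so `tmp` never touches memory (3 element accesses per i:
// read x, read y, write out).
std::vector<float> scale_then_add_fused(const std::vector<float>& x,
                                        const std::vector<float>& y,
                                        float a) {
    assert(x.size() == y.size());
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const float scaled = a * x[i];
        out[i] = scaled + y[i];
    }
    return out;
}
```

Both functions compute the same result; the fused one just skips the intermediate buffer. For a memory-bound operation like this, that saved traffic is the kind of win a library's fixed set of kernels may not give you.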

I’m suggesting it’s better to think about your goals for one minute and understand the basic choices than it is to assume there’s something that works for everyone’s goals, and higher level languages don’t meet everyone’s goals.

I think there are very few things that should be written in CUDA, and many of the things that are were written by people who just like writing CUDA for the fun of it.
That advice seems like nonsense. It's like saying avoid C because you can use Python, or avoid writing a graphics engine because you can license Unreal.
Not at all, the advice is more like: use SDL or Raylib instead of writing your framebuffer blitter in inline assembly called from C.
I bet you will learn a lot doing that, though.
Depends on whether the purpose is learning or actually delivering something in the same amount of time.

Each one has their place.

can very much agree about not writing stuff like reductions yourself, unless you have good reason to. but this sort of feels like another "implement everything with <nvidia stuff> and you'll have a great time!! (but also coincidentally get locked in even more to Nvidia hardware)"
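As a small illustration of the "don't write reductions yourself" half of this, here is a host-side C++ sketch comparing a hand-rolled sum with the standard library's `std::accumulate`. Thrust's `thrust::reduce` has a similar iterator-based call shape on the GPU, but it isn't used here so the snippet compiles without the CUDA toolkit. The function names are hypothetical.

```cpp
#include <numeric>
#include <vector>

// Hand-rolled reduction: fine on one CPU thread, but a GPU version of this
// loop needs a tree-shaped reduction, synchronization, and tuning, which is
// where hand-written code tends to go wrong.
long long sum_by_hand(const std::vector<int>& values) {
    long long total = 0;
    for (int value : values) {
        total += value;
    }
    return total;
}

// Library reduction: the 0LL initial value makes the accumulator a
// long long, so summing many ints does not overflow an int accumulator.
long long sum_with_library(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0LL);
}
```

The lock-in concern still stands: standard algorithms like this are portable, while a vendor library call ties that code path to the vendor's toolchain.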