| HN Mirror



	by juvoly 32 days ago
	Increasingly (for instance on the ADSP podcast [1]), those in NVIDIA's inner circle are advocating against writing your own CUDA kernels (unless that's your full-time job at NVIDIA, that is). [1] https://adspthepodcast.com/2024/08/30/Episode-197.html

4 comments

halJordan 32 days ago

That would be cool, but NVIDIA released Blackwell and still has not released unbroken kernels for sm120. sm120 is not the data center GPU, so it doesn't get the same love. My point, unfortunately, is that we can't depend on NVIDIA to do the right thing.


dahart 32 days ago

It’s not about whether you work at Nvidia. Avoid writing CUDA kernels if there are higher level libraries that do what you need. Do write CUDA kernels if you want to learn how, or if you need the low level control, or to micro-optimize. Being able to fuse kernels to avoid memory traffic or get better specialization is also a reason to reach for raw CUDA. Just consider what’s the right tool for the job…
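
To make the fusion point concrete, here is a minimal sketch of what "fusing kernels to avoid memory traffic" can look like. The kernel names, the `run_fused` helper, and the scale-then-add operation are all made up for illustration and are not taken from any library or from the podcast. Error checking is left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Unfused version: two launches. The intermediate `tmp` is written to
// global memory by the first kernel and read back by the second.
__global__ void scale(const float* x, float* tmp, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tmp[i] = a * x[i];
}

__global__ void add(const float* tmp, const float* y, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tmp[i] + y[i];
}

// Fused version: one launch. The intermediate a * x[i] never leaves
// the thread, so the extra global-memory round-trip disappears.
__global__ void scale_add_fused(const float* x, const float* y,
                                float* out, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a * x[i] + y[i];
}

// Hypothetical host helper: copies inputs to the device, runs the fused
// kernel, and returns the result. Assumes x and y have the same length.
std::vector<float> run_fused(const std::vector<float>& x,
                             const std::vector<float>& y, float a) {
    int n = static_cast<int>(x.size());
    if (n == 0) return {};
    size_t bytes = n * sizeof(float);

    float *dx = nullptr, *dy = nullptr, *dout = nullptr;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMalloc(&dout, bytes);
    cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale_add_fused<<<blocks, threads>>>(dx, dy, dout, a, n);

    std::vector<float> out(n);
    // This blocking copy runs after the kernel on the default stream.
    cudaMemcpy(out.data(), dout, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dx);
    cudaFree(dy);
    cudaFree(dout);
    return out;
}
```

Calling `run_fused({1, 2, 3}, {10, 20, 30}, 2.0f)` should give the same values as launching `scale` followed by `add`, with one fewer pass over global memory. Whether that actually matters depends on how memory-bound the workload is, so it is worth profiling before and after.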


saagarjha 31 days ago

I don't think writing CUDA is a good way to do this tbh


nnevatie 31 days ago

To do what? If you need the highest-performance GPU kernels on NVIDIA hardware, using CUDA is the way to go.


saagarjha 31 days ago

Writing efficient CUDA code is very, very difficult; most CUDA code is not actually good at utilizing the hardware. It is much easier to write performant code in higher level languages (and most people are doing exactly this).


dahart 31 days ago

That all depends on what you’re doing. Like I said, if a high level lang or lib supports and fits your goal well, then yes you should use it. I don’t know what most people are doing, but it’s fair to say that a lot of people can use a higher level language.

If you’re trying to learn CUDA, then using a higher level language is not the best approach. If you already used a high level language and found that your performance is lacking and could be better if you could fuse some of your kernels, and avoid some of the memory round-trips, then moving to something lower level is called for.

I’m suggesting it’s better to think about your goals for one minute and understand the basic choices than it is to assume there’s something that works for everyone’s goals, and higher level languages don’t meet everyone’s goals.


saagarjha 30 days ago

I think there are very few things that should be written in CUDA, and much of what does get written in it comes from people who just like to write CUDA for the fun of it.


drnick1 32 days ago

That advice seems like nonsense. It's like saying avoid C because you can use Python, or avoid writing a graphics engine because you can license Unreal.


pjmlp 31 days ago

Not at all, the advice is like use SDL or Raylib instead of writing your framebuffer blitter in inline Assembly to call from C.


lacedeconstruct 31 days ago

I bet you will learn a lot doing that though


pjmlp 31 days ago

Depends on whether the purpose is learning or actually delivering something in the same amount of time.

Each one has their place.


bobmarleybiceps 32 days ago

Can very much agree about not writing stuff like reductions yourself unless you have a good reason to. But this sort of feels like another "implement everything with <nvidia stuff> and you'll have a great time!! (but also coincidentally get locked in even more to Nvidia hardware)"
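
For reference, "not writing the reduction yourself" can be as small as the sketch below, which uses Thrust (shipped with the CUDA Toolkit). The `gpu_sum` wrapper is a made-up name for illustration, and this is just one way to do it; the lock-in concern above still applies, since Thrust here targets the CUDA backend.

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <vector>

// Hypothetical helper: sums host data on the GPU with Thrust's library
// reduction instead of a hand-written reduction kernel.
float gpu_sum(const std::vector<float>& host) {
    // Constructing the device_vector copies the data to the GPU.
    thrust::device_vector<float> d(host.begin(), host.end());
    // thrust::reduce picks the launch configuration and handles the
    // tree reduction; the result comes back as a host-side float.
    return thrust::reduce(d.begin(), d.end(), 0.0f, thrust::plus<float>());
}
```

A hand-rolled version needs shared memory, synchronization, and a second pass or atomics to combine per-block results, which is exactly the part that is easy to get subtly wrong or slow.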
