Hacker News
by tmurray 5174 days ago
This post is very accurate. I build APIs for a living (CUDA), and this lines up pretty well with my experience. Writing APIs is very tough: you will get a lot of things wrong, and the fixes available to you after you realize your mistake are all ugly at best.

One quick example:

In CUDA, you have to explicitly copy memory to and from the GPU. We have two basic kinds of memcpy functions--synchronous and asynchronous. Asynchronous requires some additional parameter validation because the GPU has to be able to DMA that particular piece of memory, etc. After we had been shipping this for a release or two, we noticed that our parameter checking for the asynchronous call was missing one very particular corner case and would silently fall back to synchronous copies instead of returning an error. We thought, okay, let's just fix that by returning an error because surely no one managed to hit this.

Absolute carnage. Tons of applications broke. This particular case was being used everywhere. It provided no benefit whatsoever in terms of speed; in fact, it was just a more verbose way to write a standard synchronous memcpy. People did it anyway because... they thought it must be faster because it had async in the name? I don't know.

In the end, we made the asynchronous functions silently fall back to synchronous memcpys in all cases when the stricter parameter validation failed.
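
To make the three behaviors in this story concrete, here is a minimal host-only C++ sketch. It is not CUDA's implementation or API: `copy_async`, `HostBuffer`, `Policy`, and the `dma_capable` flag are all hypothetical names, and the flag simply stands in for whatever the stricter parameter validation checks (the actual corner case is not spelled out above). `Policy::Strict` models the attempted fix that returned an error, and `Policy::FallBackToSync` models what was finally shipped.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical model for illustration only; none of these names are CUDA's.
enum class Status { Ok, InvalidValue };
enum class Path { None, Sync, Async };    // which path performed the copy
enum class Policy { Strict, FallBackToSync };

struct CopyResult {
    Status status;
    Path path;
};

struct HostBuffer {
    void* data;
    std::size_t size;
    bool dma_capable;  // assumption: outcome of the stricter validation
};

CopyResult copy_async(void* dst, const HostBuffer& src, std::size_t n,
                      Policy policy) {
    // Basic validation shared by the sync and async paths.
    if (n > src.size) {
        return {Status::InvalidValue, Path::None};
    }
    if (!src.dma_capable) {
        if (policy == Policy::Strict) {
            // The attempted fix: reject the call. This is what broke callers.
            return {Status::InvalidValue, Path::None};
        }
        // The shipped behavior: quietly do a blocking copy instead.
        std::memcpy(dst, src.data, n);
        return {Status::Ok, Path::Sync};
    }
    // A real driver would enqueue a DMA transfer and return immediately;
    // this model copies inline so the sketch stays self-contained.
    std::memcpy(dst, src.data, n);
    return {Status::Ok, Path::Async};
}
```

With `Policy::FallBackToSync`, a caller that passes a non-DMA-capable buffer still gets `Status::Ok` and never learns that the copy blocked, which is exactly why so much code depended on the behavior without anyone noticing.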

3 comments

Well, if it helps any, I've found the CUDA API to be somewhat lacking in features, but more or less robust, and pretty logical to deal with as a developer. Keep up the good work.
Thanks for your work on CUDA, it's really a great tool! My one hope is that Nvidia decides to make it a direct competitor to OpenCL by allowing it to target different platforms (though I recognize that it's not completely up to Nvidia and requires cooperation from others).
What would be the value in competing with OpenCL? CUDA and OpenCL ship pretty much an identical feature set with small changes in the API. The biggest difference is that OpenCL requires you to use buffers+offsets where CUDA allows "pointers" to GPU memory.
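
A rough way to picture the buffers+offsets versus pointers difference, as a host-only C++ sketch rather than real OpenCL or CUDA code. `DeviceBuffer`, `sum_via_handle`, and `sum_via_pointer` are hypothetical names; the opaque class stands in for a buffer object the caller cannot do arithmetic on, and the raw `float*` stands in for a GPU pointer. Both functions assume the requested range is in bounds.

```cpp
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical opaque handle: callers cannot see or offset the storage,
// so every operation takes an explicit element offset.
class DeviceBuffer {
public:
    explicit DeviceBuffer(std::vector<float> data) : data_(std::move(data)) {}
    std::size_t size() const { return data_.size(); }

private:
    std::vector<float> data_;
    friend float sum_via_handle(const DeviceBuffer&, std::size_t, std::size_t);
};

// Buffer + offset style: a sub-range is (handle, offset, count).
float sum_via_handle(const DeviceBuffer& buf, std::size_t offset,
                     std::size_t count) {
    const float* first = buf.data_.data() + offset;
    return std::accumulate(first, first + count, 0.0f);
}

// Pointer style: the caller computes the sub-range address itself
// and passes it like any other pointer.
float sum_via_pointer(const float* first, std::size_t count) {
    return std::accumulate(first, first + count, 0.0f);
}
```

In the handle style the offset has to travel through every function signature, while in the pointer style a call such as `sum_via_pointer(base + 1, 2)` is all that is needed.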

It's nice that Nvidia has CUDA so they can go ahead and expose functionality in new GPUs without having to (first) deal with OpenCL standardization. However, for the long term, it would be better if we stuck to OpenCL so that at least parts of the source code can be shared between CPUs and GPUs of different vendors.

Competition is a good thing, and CUDA isn't a direct competitor to OpenCL because it only targets Nvidia's platform. OpenCL is obviously the long-term winner because nobody in their right mind would want to lock themselves into a single vendor. I'm saying that I would appreciate Nvidia challenging that, which would drive both CUDA and OpenCL to become better.
Crazy, you work for Nvidia? I love using CUDA for VRAY rendering. It still lacks a ton of the material support, but it's getting there. From your development work, how much more needs to be added to CUDA and video cards to get more accurate renderings with full support of materials? I know VRAY has figured out some amazing ways to get around this and recently added support for multisided materials.
CUDA does not know anything about materials; it's only an API to run code on the GPU and transfer data back and forth between main memory and GPU memory. So it's all up to the VRAY devs to implement material rendering with the tools they have available.

As far as I know, there will be no revolutionary changes in GPGPU programming in the near future (apart from badass-er GPUs).

Someone should get a MySQL query server working on CUDA.