Hacker News
by sliverstorm 4408 days ago
To my understanding, the critical difference between framebuffer graphics and 3D API graphics is the processing! In a framebuffer scenario, the CPU does all the rendering. Since CPU is poorly suited to rendering 3D, we have a coprocessor called a GPU. The CPU has to feed the GPU work.

Because the GPU is cutting-edge, there is a certain amount of magic voodoo required for top performance that needs to get abstracted away. Maybe the particular model of GPU you have doesn't support some common instruction. You don't want to handle that in your software; you want to hide it in the driver.

Beyond that, the API is also there to make the GPU easier to use. OpenGL is a mess, sure, but to my understanding most developers would pull their hair out and give up if they had to program the GPU directly.


Isn't the GPU just a computer that happens to support more parallelism than the CPU? Why not have a simpler API based on general-purpose operations like map/reduce/scatter/gather? Then there would be no need to add new "cutting edge" operations every year. I for one would be happy to use that instead of OpenGL or DX.
> Isn't the GPU just a computer that happens to support more parallelism than the CPU?

Not really. It's like quantum mechanics compared to classical physics.

For instance, "branches" don't work like you'd expect. On a CPU you execute one branch or the other. On a GPU, you get things like both branches execute, but then it just throws away the half that shouldn't have run, which means you're bottlenecked by whichever branch takes the longest. (Or something like that. The details escape me, but I do remember something about CUDA's branching doing weird things.) Point being, GPUs are weird. It's nothing like programming a CPU at all.

> On a GPU, you get things like both branches execute, but then it just throws away the half that shouldn't have run

It's not that weird. You don't really have thousands of parallel processors, but a single processor operating on thousands of values. (Like SIMD on steroids.)

Since all operations must be done identically on all values, a "branch" is really doing both branches and recombining them with a mask of equally many booleans - as you say "throwing away" the unwanted branch.
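To make the masking idea concrete, here is a minimal sketch in plain Python that emulates one "branch" over a handful of lanes. The function name `simd_branch` and the toy branch bodies are made up for illustration, and this is a simplification: as far as I know, real hardware can skip one side entirely when every lane in a group agrees on the condition.

```python
def simd_branch(values):
    """Emulate `if x < 0: x = -x * 2 else: x = x + 1` in lockstep over all lanes."""
    # Every lane evaluates the condition, giving one boolean per lane (the mask).
    mask = [v < 0 for v in values]
    # Both sides of the branch are computed for every lane, wanted or not.
    then_result = [-v * 2 for v in values]
    else_result = [v + 1 for v in values]
    # The mask picks which result each lane keeps; the other one is thrown away.
    return [t if m else e for m, t, e in zip(mask, then_result, else_result)]


lanes = [-3, 0, 5, -1]
print(simd_branch(lanes))  # [6, 1, 6, 2]
```

Note that the cost is the work of both sides combined, no matter how few lanes actually needed the expensive one.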

The GPU microops continue to change for the same reason we got x87, MMX, AVX256... We expect the very best performance from our graphics coprocessor, and sometimes the fastest way to do something is with a new hardware op.

The GPU doesn't get more complicated to support OpenGL; OpenGL gets more complicated to 1) meet the needs of developers and 2) support the GPU.

The real way to stop adding new features & ops every year is to stop caring about whether your GPU is fast.

P.S. The GPU is a computation engine that is more parallel than the CPU, but it is not analogous to a CPU with more threads. A GPU basically cannot branch (if/for/while) worth a damn, for example.

Among other things, yes, the GPU is a computer that supports more parallelism than the CPU. But not in the way you want.

There are a few issues. One is that the GPU does much more than map/scatter/gather (reduce is hard in parallel so it doesn't do that); look into the stages of the graphics pipeline and you'll see what I mean. The other big one is that it doesn't work like a CPU in a lot of ways, and making it general purpose like one would lose enough of the performance that it would no longer be useful in many cases.

Really what you're asking for is a super-parallel general-purpose CPU, which really isn't what a GPU is or wants to be.
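As a rough illustration of why reduce is a different animal from map, here is a sketch of a pairwise (tree) reduction in plain Python. The name `tree_reduce` is hypothetical and this is not how any particular GPU or API does it. It only shows the shape of the problem: a map is one independent pass, while a reduction needs a chain of passes where each pass has to wait for the previous one.

```python
def tree_reduce(values, op):
    """Pairwise reduction; returns (result, number_of_dependent_passes)."""
    data = list(values)
    if not data:
        raise ValueError("cannot reduce an empty sequence")
    passes = 0
    while len(data) > 1:
        # Within one pass every pair is independent, so the pairs could run in parallel.
        paired = [op(data[i], data[i + 1]) for i in range(0, len(data) - 1, 2)]
        # An odd element left over is carried into the next pass unchanged.
        if len(data) % 2:
            paired.append(data[-1])
        # The next pass cannot start until this one has finished (a synchronisation point).
        data = paired
        passes += 1
    return data[0], passes


total, passes = tree_reduce(range(1, 9), lambda a, b: a + b)
print(total, passes)  # 36 3
```

Eight values need three dependent passes, and each pass uses half as many lanes as the one before, so most of the parallel hardware sits idle by the end.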