by tlackemann 1648 days ago
Croteam did wonders with that engine, truly a beautiful looking game inside and out.

Shaders are one of those topics that you can easily get lost in. It was one of my more recent "you don't know what you don't know" topics. The idea that your code will run one time for each pixel on your screen, 60-144x a second (!), is mind-boggling. It's still hard for me to wrap my head around it sometimes, and when I finally write something that compiles, it feels like magic every time.
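
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The 1920x1080 resolution is my own assumption (no resolution is named above), and it assumes exactly one shader invocation per pixel per frame, ignoring overdraw and multiple passes.

```python
def invocations_per_second(width, height, fps):
    """Shader invocations per second, assuming one invocation per pixel per frame."""
    return width * height * fps

# Assumed 1080p display; the two refresh rates are the ends of the 60-144 range.
low = invocations_per_second(1920, 1080, 60)
high = invocations_per_second(1920, 1080, 144)

print(f"{low:,} to {high:,} invocations per second")
# prints "124,416,000 to 298,598,400 invocations per second"
```

So even under these simplified assumptions, the same small program runs on the order of a hundred million times or more per second.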

2 comments

It gets even weirder when you realize it's not just running your code for each pixel: it's running the exact same instructions in parallel for large square blocks of pixels, which makes branching incredibly expensive.
Only as expensive as the slowest pixel in the batch :D
That's not exactly true; it can be slower than the slowest individual pixel. It's not just running the same code for each pixel in parallel across many cores: a single core* actually runs multiple pixels at once and therefore has to have the same program counter for all of those pixels. If two pixels diverge, the core has to alternate between the different PCs and toggle each lane on and off depending on which pixel is currently executing.

That means if you had a shader like:

    if (pixelIndex % 2) {
        longFunctionA();
    } else {
        longFunctionB();
    }
It would take roughly twice as long to run as every pixel calling the same function (assuming the two functions cost about the same). Each core is executing a batch of pixels (a warp) that is evenly split between two completely different sections of code, so it has to alternate between them until both finish.

* Core might not be exactly the right term; Nvidia calls them SMs, and other GPU vendors have different names.
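
The "slower than the slowest pixel" point can be sketched with a toy cost model in Python. This is not real GPU code, just an illustration of lockstep execution under stated assumptions: the warp width of 32 matches Nvidia hardware, while the function name `warp_cost` and the cycle counts for `longFunctionA` and `longFunctionB` are hypothetical.

```python
def warp_cost(lane_paths, path_cost):
    """Cycles for one warp under a simple lockstep model.

    lane_paths: the branch each lane (pixel) takes, e.g. ["A", "B", ...].
    path_cost:  cycles each branch needs when run on its own.

    The warp shares one program counter, so every branch taken by at
    least one lane is executed in turn while the other lanes are masked off.
    """
    return sum(path_cost[p] for p in set(lane_paths))


WARP_SIZE = 32               # assumed warp width (32 lanes on Nvidia GPUs)
cost = {"A": 100, "B": 100}  # hypothetical cycle counts for longFunctionA/B

# Every pixel takes the same branch.
uniform = warp_cost(["A"] * WARP_SIZE, cost)

# Odd pixels take A, even pixels take B, as in the shader above.
divergent = warp_cost(["A" if i % 2 else "B" for i in range(WARP_SIZE)], cost)

# What "only as expensive as the slowest pixel" would predict.
slowest_pixel = max(cost.values())

print(uniform, divergent, slowest_pixel)  # prints "100 200 100"
```

In this model the divergent warp costs the sum of both branches, not the maximum, which is why it ends up at twice the cost of the slowest single pixel.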

With deferred rendering it's possibly running multiple shaders per pixel on the screen. A good example is the excellent Doom graphics study, which shows how Doom 2016 does this: https://www.adriancourreges.com/blog/2016/09/09/doom-2016-gr...