by junon 1656 days ago
The screenshots are from one of my favorite games called The Talos Principle. Curious, is the author of the site associated with it? That game is custom built, looks incredible even on cheaper hardware (e.g. it looked beautiful on my 2015 MBP when I first played it). Crazy stuff.
2 comments

Croteam did wonders with that engine, truly a beautiful looking game inside and out.

Shaders are one of those topics that you can easily get lost in. It was one of my more recent "you don't know what you don't know" topics. The idea that your code will run once for each pixel on your screen, 60-144x a second (!), is mind-boggling. It's still hard for me to wrap my head around sometimes, and when I finally write something that compiles it feels like magic every time.
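To put a rough number on that, here is a back-of-the-envelope sketch in Python. The 1920x1080 resolution is my own assumption for illustration, and it counts a single shader run per pixel per frame, ignoring overdraw and extra passes.

```python
def invocations_per_second(width, height, fps):
    """Shader runs per second, assuming exactly one run per pixel per frame."""
    return width * height * fps

# Assumed 1920x1080 display; the refresh rates are the 60-144 range mentioned above.
low = invocations_per_second(1920, 1080, 60)
high = invocations_per_second(1920, 1080, 144)

print(low)   # 124416000
print(high)  # 298598400
```

So even under these simplifying assumptions, that's somewhere between roughly 124 million and 299 million runs of your code every second.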

It gets even weirder when you realize it's not just running your code for each pixel; it's running the exact same instructions in parallel for large square blocks of pixels, which makes branching incredibly expensive.
Only as expensive as the slowest pixel in the batch :D
That's not exactly true; it can be slower than the slowest individual pixel. It's not just running the same code for each pixel in parallel across many cores: a single core* actually runs many pixels at once and therefore has to have the same program counter on all of those pixels. If two pixels diverge, then the core has to alternate between the different PCs and toggle each lane on and off depending on which pixel is currently executing.

That means if you had a shader like:

    // Odd and even pixels take different branches, so every batch diverges.
    if (pixelIndex % 2 == 1) {
        longFunctionA();
    } else {
        longFunctionB();
    }
It would actually take roughly twice as long to run compared to every pixel calling the same function, assuming both functions cost about the same. Each core is executing a batch of pixels (a warp) that is evenly split between two completely different sections of code, so it has to alternate between them until they both finish.

* Core might not be the exact right term, Nvidia calls them SMs and other GPU vendors have different names.
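
The cost of that alternation can be sketched with a toy model in Python. This is only a simplified illustration, not how any real GPU schedules work: the warp width of 32 and the cycle counts are made-up assumptions, and the model just says a warp pays for every branch that at least one of its lanes takes.

```python
WARP_SIZE = 32  # assumed warp width, for illustration only

def warp_cost(lane_branches, branch_costs):
    """Cycles for one warp in a toy lockstep model.

    The warp runs each distinct branch taken by any lane, one after another,
    with the lanes on the other branch masked off in the meantime.
    """
    return sum(branch_costs[branch] for branch in set(lane_branches))

# Hypothetical cycle counts for longFunctionA and longFunctionB.
costs = {"A": 100, "B": 30}

uniform = ["A"] * WARP_SIZE                                    # every pixel calls A
diverged = ["A" if i % 2 else "B" for i in range(WARP_SIZE)]   # odd/even split

print(warp_cost(uniform, costs))   # 100
print(warp_cost(diverged, costs))  # 130
```

In this model the diverged warp costs 130 cycles even though no single pixel needs more than 100, which is the "slower than the slowest individual pixel" effect. With equal costs for both functions, the diverged case comes out at exactly double.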

With deferred rendering, it's possibly running multiple shaders per pixel on the screen. A good example is the excellent Doom graphics study, which shows how Doom 2016 does this: https://www.adriancourreges.com/blog/2016/09/09/doom-2016-gr...
The Talos Principle is really one of the best puzzle games I've played. It scratched the same itch for me that Portal did.