I'm not a shader programmer, but I think the key to writing proper shaders is understanding that they run on a parallel computational system built around vector/matrix math (linear algebra), and being able to conceptualize how to use that for graphics (within the context of a pixel vs. vertex shader).
It seems that some people (software developers) can do this, and others struggle. Myself, I'm kinda in the middle - then again, I haven't really tried anything with shaders beyond simple examples and such. But my "aha" moment in parallel calcs of this nature came to me during the course of the original 2011 ML Class (MOOC) taught by Andrew Ng - it used Octave as the programming language, and we had to implement a neural network.
We wrote the "serial implementation" first (and it was really slow), but then we were "challenged" to do the same using only vector/matrix math. I struggled to wrap my head around it, but then a lightbulb went on, and I realized what it was all about. Granted, behind the scenes a still very much serial process was happening (likely using BLAS), but I could understand the conceptual underpinnings (and that they applied to all similar parallel systems, like clustered processors and GPUs). It was a real revelation to me at the time.
Even so, there is something special about the people who can see and do similar things at a "frame buffer" level; some of that stuff is downright beautiful and amazing to watch (and the code is oftentimes so small, as you noted).
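To make the serial vs. vector/matrix contrast above concrete, here is a rough sketch of one neural network layer written both ways. It uses Python with NumPy rather than Octave, and the function names (`layer_serial`, `layer_vectorized`), the sigmoid activation, and the layer sizes are all assumptions for illustration, not the course's actual assignment code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_serial(W, b, x):
    """One dense layer for one example, one multiply-add at a time."""
    out = []
    for j in range(W.shape[0]):        # each output unit
        total = b[j]
        for i in range(W.shape[1]):    # each input feature
            total += W[j, i] * x[i]
        out.append(sigmoid(total))
    return np.array(out)

def layer_vectorized(W, b, X):
    """Same layer for a whole batch: one matrix product, no explicit loops."""
    return sigmoid(X @ W.T + b)

# Hypothetical sizes: 4 units, 3 input features, batch of 5 examples.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)
X = rng.normal(size=(5, 3))

serial = np.array([layer_serial(W, b, x) for x in X])
vectorized = layer_vectorized(W, b, X)

# Both versions should agree up to floating-point rounding.
print(np.allclose(serial, vectorized))
```

The two versions compute the same numbers; the difference is that the second one describes the whole batch as a single operation, which is the form a parallel backend can actually spread across many lanes or cores.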
That's only part of the story. The linear algebra and the GPU execution model are just the first step. To write a shader, you also have to understand quite a few things about light, materials, etc., and that is the hard part.
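As a small illustration of where the light and material knowledge comes in, here is a sketch of Lambertian diffuse shading, one of the simplest lighting models, evaluated for every pixel at once. The comment above doesn't name any particular model, so the choice of Lambert, the function name `lambert`, and the tiny 2x2 "image" of normals are all assumptions made for the example; it is written in Python with NumPy rather than an actual shading language.

```python
import numpy as np

def lambert(normals, light_dir, albedo):
    """Diffuse color per pixel: albedo * max(0, n . l).

    normals:   (H, W, 3) unit surface normals
    light_dir: (3,) direction toward the light (normalized here)
    albedo:    (3,) RGB surface color
    """
    l = light_dir / np.linalg.norm(light_dir)
    n_dot_l = np.clip(normals @ l, 0.0, None)   # (H, W), back-facing -> 0
    return albedo * n_dot_l[..., None]          # (H, W, 3)

# Hypothetical 2x2 image: one normal faces the light, two are
# perpendicular to it, and one faces away.
normals = np.array([
    [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0]],
])
light_dir = np.array([0.0, 0.0, 2.0])
albedo = np.array([1.0, 0.5, 0.25])

shaded = lambert(normals, light_dir, albedo)
print(shaded[0, 0])   # the pixel facing the light gets the full albedo
```

The math is a single dot product per pixel; knowing that a dot product between the normal and the light direction is the right thing to compute is the part that comes from understanding light and materials.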