No real questions, just that I love reading about this sort of optimization - Abrash's Black Book is a favourite that still gets pulled out every now and then. Thanks for the fun post!
It always astounds me these days when someone manages to release slow software for a desktop computer: modern systems are orders of magnitude faster than the first desktop computers, yet the software is often no more responsive.
Edit: Also I'm amused by how much of a nerve this seems to have hit. I guess some people are defensive about their high performance computing approaches... :)
> I guess some people are defensive about their high performance computing approaches...
Fortunately, when articles about real HPC clusters (aka Supercomputers) make it to the front page, they need no defense, only an occasional explanation that what makes them (extra) special are the high-bandwidth, low-latency interconnects. (Those interconnects make a very strong effort at addressing Fallacies of Distributed Computing numbers 2 and 3[1] and tend to neatly take care of the remaining 6.)
To be fair, though, I don't think the claim for the more common distributed systems is that they're "high performance" so much as that they're scalable (and have other benefits of being multi-node like resilience).
[1] The bandwidth may well be indistinguishable from infinite if it exceeds local (e.g. CPU-memory, CPU-CPU, CPU-GPU) bandwidths. I don't think that's yet true for something like multi-socket Intel systems, but it might be possible with enough interface cards. I didn't look at the specs on those POWER9 chips.
I use it from time to time, but I certainly wouldn't consider myself an expert. When I need to do something beyond my current knowledge I search around a bit until I find the awk syntax I need to get the job done. Sorry I can't be more helpful!
I'll second the sibling comment about the original AWK book (The AWK Programming Language).
I'd also suggest just using it more. If you find yourself wanting to use grep, cut or sed (especially if you need more than one!) for a one-liner, try using awk instead, if only for the practice. Once you're accustomed to some of the idioms, simplicity, and built-in looping, it will feel like a more natural, casual text-processing tool that you'd reach for automatically.
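As a rough illustration of that kind of swap (my own sketch, with made-up sample data and variable names, assuming a POSIX shell plus grep, cut and awk), here is a grep + cut pipeline next to a single awk invocation that should produce the same output:

```shell
# Hypothetical sample input: colon-separated records, passwd-style.
data='alice:x:1001:/home/alice
bob:x:1002:/home/bob
carol:x:1003:/srv/carol'

# Two tools: grep selects the lines, cut picks the first field.
with_pipeline=$(printf '%s\n' "$data" | grep '/home/' | cut -d: -f1)

# One tool: the awk pattern selects the lines, the action prints field 1.
with_awk=$(printf '%s\n' "$data" | awk -F: '/\/home\// { print $1 }')

printf '%s\n' "$with_awk"
```

The pattern/action pair does the job of both tools, and the implicit loop over input lines means there's nothing else to write.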
Lastly, search for "cookbooks" or libraries or other collections of useful scripts to see how others have used the language in more advanced ways. Personally, I see this less as something to aspire to (i.e. using the language for complex, multi-line programs) and more as a way to better understand the power of some of the quirkier features that might otherwise be less obvious.
One of my favorite non-obvious behaviors is bypassing the input/loop functionality by having the only pattern be BEGIN. This results in an output-only program that makes for a convenient way of playing around with things like printf or just language syntax features (or differences between awk/mawk/gawk), without having to redirect input from /dev/null or worrying about hitting control-D twice and exiting the shell (for those of us who abhor ignoreeof):
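A minimal sketch of such a BEGIN-only program (the format string and values are just an illustration, assuming any POSIX awk on the PATH):

```shell
# The only pattern is BEGIN, so awk runs the action and exits
# without ever reading stdin: no /dev/null redirect, no control-D.
out=$(awk 'BEGIN { printf "%5.2f|%-4s|%03d\n", 3.14159, "ab", 7 }')
printf '%s\n' "$out"
```

Swapping `awk` for `mawk` or `gawk` in the same line is a quick way to compare how each implementation handles a given format or expression.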