| That's part of the section in Programming Perl that sticks in my memory. From my copy... > Efficiency > ... > Note that optimizing for time may sometimes cost you in space or programmer efficiency (indicated by conflicting hints below). Them’s the breaks. If program- ming was easy, they wouldn’t need something as complicated as a human being to do it, now would they? > ... > Programmer Efficiency > The half-perfect program that you can run today is better than the fully perfect and pure program that you can run next month. Deal with some temporary ug- liness.1 Some of these are the antithesis of our advice so far. • Use defaults.
• Use funky shortcut command-line switches like –a, –n, –p, –s, and –i.
• Use for to mean foreach.
• Run system commands with backticks.
...
• Use whatever you think of first.
• Get someone else to do the work for you by programming half an implementation and putting it on Github.
> Maintainer Efficiency> Code that you (or your friends) are going to use and work on for a long time into the future deserves more attention. Substitute some short-term gains for much better long-term benefits. • Don’t use defaults.
• Use foreach to mean foreach.
...
|
In threaded code it's not uncommon to analyze a piece of data and fire off background tasks the moment you encounter them. But if your workload is a DAG instead of a tree, you don't know if the task you fired is needed once, twice, or for every single node. So now you introduce a cache (and if you're a special idiot, you call it Dynamic Programming which it is fucking not) and deal with all of the complexities of that fun problem.
But it turns out in a GIL environment, you're making a lot less forward progress on the overall problem than you think you are because now you're context switching back and forth between two, three, five tasks with separate code and data hotspots, on the same CPU rather than running each on separate cores. It's like the worst implementation of coroutines.
If instead you scan the data and accumulate all the work to be done, and then run those tasks, and then scan the new data and accumulate the next bit of work to be done, you don't lose that much CPU or wall clock time in single threaded async code. What you get in the bargain though is a decomposition of the overall problem that makes it easy to spot improvements such as deduping tasks, dealing with backpressure, adding cache that's more orthogonal, and perhaps most importantly of all, debugging this giant pile of code.
So I've been going around making code faster by making it slower, removing most of the 'clever' and sprinkling a little crypto-cleverness (when the clever thing elicits an 'of course' response) / wisdom on top.