by marcandre 746 days ago
I'd love examples where DRY can really hurt performance. Typically what matters most in terms of performance is the algorithm used, and that won't change.

More importantly, cleverer people than me said "premature optimization is the root of all evil".

8 comments

This quote is often taken out of context. Here's the full quote: "Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

If you want a specific example, look at something that needs to be performant, i.e. something in that 3%: say OpenSSL's AES implementation for x86, or some optimized LLM code. You'll see that the performance-critical sections include things that could be reused, but aren't.

Also the point Knuth is making is don't waste time on things that don't matter. Overuse of DRY falls squarely into that camp as well. It takes more work and doesn't really help. I like Go's proverb there of "A little copying is better than a little dependency."

Knuth was talking about a very specific thing, and the generalization of that quote is a misunderstanding of his point.

Source: Donald Knuth on the Lex Fridman podcast, when Lex asks him about that phrase

I wasn't aware this was discussed, thanks for the pointer! I'm curious now what he says he was talking about ;)

IMO DRY hurts developer productivity more than performance, because it introduces indirection and potentially unhelpful abstractions that can obscure what is actually going on and make the code harder to understand.

In raw performance this could manifest as data duplication bloating structures and causing cache misses, generic structures expressed in JSON being slower than a purpose-built struct, or pointer chasing because functions are buried in polymorphic hierarchies. But I doubt that any of this would really matter in 99% of applications.

Premature optimization is about not making a micro-implementation change (e.g. `++i` vs `i++`) for the sake of perceived performance. You should always measure to identify slow points in expected workloads, profile to identify the actual slow areas, make high-level changes (data structure, algorithm) first, then make more targeted optimizations if needed.

In some cases it makes sense, like writing SIMD/etc. specific assembly for compression/decompression or video/audio codecs, but more often than not the readable version is just as good -- especially when compilers can do the optimizations for you.

A lot of times I've found performance increases have come from not duplicating work -- e.g. not fetching the same data each time within a loop if it is fixed.
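
A rough sketch of that last point in Python. `fetch_exchange_rate` is a made-up stand-in for any expensive lookup (DB query, HTTP call, config read) whose result doesn't change during the loop; the `calls` counter is only there to make the difference visible.

```python
calls = {"fetch": 0}  # counts how often the "expensive" lookup runs


def fetch_exchange_rate():
    """Hypothetical stand-in for an expensive lookup whose result is fixed."""
    calls["fetch"] += 1
    return 1.25


def convert_slow(amounts):
    # Re-fetches the same fixed value on every iteration.
    return [amount * fetch_exchange_rate() for amount in amounts]


def convert_fast(amounts):
    # Fetches once, before the loop, and reuses the value.
    rate = fetch_exchange_rate()
    return [amount * rate for amount in amounts]
```

Both return the same list; the first does one lookup per element, the second does one lookup total.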

Not really. Knuth was talking about putting effort into optimizing a non-critical portion of the software. He's saying to put effort into the smaller parts where performance is critical and not worry about the rest. It's not about `++i` vs. `i++` (which are semantically different, and with modern compilers not an optimization anyway, but I digress).

The optimizations he was talking about were things like writing in assembly or hand-unrolling loops. It was assumed that you’ve already picked a performant algorithm / architecture and are writing in a performant low-level language like C.

Also, your digression about modern compilers is irrelevant to the context of the quote, since Knuth talked about premature optimization at a time when compilers were much simpler than today.

That was my point, though. Don't worry about minor possible changes to the code where the performance doesn't matter. For example, it doesn't matter if the `++i`/`i++` is only ever executed at most 10 times in a loop, is on an integer (where the compiler can elide the semantic difference), and the body of the loop is 100x slower than the increment.

If you measure the code's performance and see the ++i/i++ is consuming a lot of the CPU time then by all means change it, but 99% of the time don't worry about it. Even better, create a benchmark to test the code performance and choose the best variant.
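
For what it's worth, a benchmark like that can be tiny. Here's a sketch using Python's standard `timeit` module; the two `sum_squares_*` functions are hypothetical variants made up for illustration, and the timings will differ per machine, so none are shown.

```python
import timeit


def sum_squares_loop(n):
    # Variant A: explicit loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    # Variant B: generator expression fed to the built-in sum().
    return sum(i * i for i in range(n))


def benchmark(func, n=1_000, repeat=5, number=200):
    # Take the best of several repeats; the minimum is the least noisy figure.
    return min(timeit.repeat(lambda: func(n), repeat=repeat, number=number))


if __name__ == "__main__":
    # Check the variants agree before comparing their speed.
    assert sum_squares_loop(1_000) == sum_squares_builtin(1_000)
    for variant in (sum_squares_loop, sum_squares_builtin):
        print(f"{variant.__name__}: {benchmark(variant):.4f}s")
```

The correctness check comes first on purpose: a faster variant that returns something different isn't a variant.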

That's not my interpretation. If you're profiling and benchmarking you're already engaging in (premature) optimization. This process you're describing of finding out whether `i++` is taking a lot of CPU time and then changing it is exactly what Knuth is saying not to worry about for 97% of your code. Knuth is saying it doesn't matter if `i++` is slow if it's in a non-performance critical part of your code. Any large piece of software has many parts where it doesn't matter for any practical purpose how fast they run and certainly one loop in that piece of software doesn't matter. For example, the software I'm working on these days has some fast C code and then a pile of slow Python code. In your analogy all the Python code is known to be much slower than the C code, we don't need a profiler or benchmarks to tell that, but it also doesn't matter because the core performant functionality is in that C code.

Knuth says forget about small efficiencies in 97% of your code. Indeed, the `i++` optimization isn't apt to make more than a small difference, even with the most naive compiler, but other decisions could lead to larger chasms. It seems he is still in favour of optimizing for the big wins across the entire codebase, even if it doesn't really matter in practice.

But it's your life to live. Who cares what someone else thinks?

In the general case, it usually depends on the latency of what you'd DRY your code to vs the latency of keeping the implementation local and specialized.

If you're talking about consolidating some code from one in-process place to another in the same language, you're mostly right: there's only going to be an optimization/performance concern when you have a very specific hotspot -- at which point you can selectively break the rule, following the guidance you quoted. This need for rule-breaking can turn out to be common in high-performance projects like audio, graphics, etc., but is probably not what the GP had in mind.

In many environments, though, DRY'ing can mean moving some implementation to an out-of-language/runtime, out-of-process, or even out-of-instance service.

For many workloads, the overhead of making a bridged, IPC, or network call swamps your algorithm choice and this is often apparent immediately during design/development time. It's not premature optimization to say "we'll do a lot better to process these records locally using this contextually tuned approach than we will calling that service way out over there, even if the service can handle large/different loads more efficiently". It's just common sense. This happens a lot in some teams/organizations/projects.

This might not be a perfect example, but there's a paper by Michael Stonebraker: '"One Size Fits All": An Idea Whose Time Has Come and Gone'.

It might not specifically be about DRY, but it's still related: generic vs. specialized code/systems.

https://ieeexplore.ieee.org/document/1410100

> I'd love examples where DRY can really hurt performance.

A really common example is the overhead of polymorphism, although that overhead can vary a lot between stacks. Another is just the effect caused by the common complaint about premature abstraction: proliferation of options/special cases, which add overhead to every case even when they don’t apply.

use compile time polymorphism

premature abstraction -> not understood dry (https://news.ycombinator.com/item?id=40525064#40525690)

Langchain. Helps on the initial productivity, is a nightmare on the debugging and performance improvement end.
In an effort to DRY, you add a bunch of if statements to handle every use case.
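
A contrived Python sketch of how that tends to look. All names are hypothetical: `export_rows` is the "DRY" function that grew a flag per caller, and `export_csv` is what one of those callers actually needed.

```python
def export_rows(rows, as_csv=False, uppercase=False, skip_empty=False, header=None):
    # One shared function for every caller: each flag is checked on every
    # call (and most of them per row), even for callers that never use it.
    out = []
    if header is not None:
        out.append(header)
    for row in rows:
        if skip_empty and not any(row):
            continue
        cells = [str(cell) for cell in row]
        if uppercase:
            cells = [cell.upper() for cell in cells]
        out.append(",".join(cells) if as_csv else " ".join(cells))
    return out


def export_csv(rows):
    # Specialized version: a little copying, no flags to check or misread.
    return [",".join(str(cell) for cell in row) for row in rows]
```

In Python the per-row branches are unlikely to be the bottleneck; the bigger cost is usually that every new use case adds another path through the shared function that has to be read and tested.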