Hacker News
Hang your code out to DRY (johan.hal.se)
76 points by hejsna 1622 days ago
11 comments

My first heuristic is: If I change the code in block A, is it assured that I will need to change the code in block B?

My second heuristic is: Can I name the method I wish to de-duplicate in a way that is honest for all cases I wish to cover, yet explains its business purpose?

The more it deviates from these heuristics, the more likely I am to duplicate the code in object oriented programming.

My slightly different heuristic:

If fact X changes, how many places in the code do we have to change it?

If the number is > 1, relying on humans to realize that will, on average, always fail. I know of two ways to fix that.

[1]: DRY it up.

[2]: Write a test asserting that all the places use the same value.

Sadly, most people rely on this:

[3]: We "just have to remember" to do these changes in all places, despite the fact that human memory is unreliable, and the future person working on this code isn't even in the room when we decided this.

Any time I hear a "we just have to remember" variation, either from coworkers or in my thoughts, an alarm goes off.
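Option [2] might look like the sketch below. It assumes a hypothetical case where a username length limit has to be stated twice, once for form validation and once in a schema description. All names and values are invented for illustration.

```python
# Hypothetical example: the same fact ("usernames are at most 32 characters")
# lives in two places that cannot easily share one definition.

# Place 1: input validation in application code.
FORM_MAX_USERNAME_LENGTH = 32

# Place 2: a schema description that mirrors the database column width.
USER_TABLE_SCHEMA = {"username": {"type": "varchar", "length": 32}}


def duplicated_username_limits():
    """Return every known copy of the username length limit."""
    return [
        FORM_MAX_USERNAME_LENGTH,
        USER_TABLE_SCHEMA["username"]["length"],
    ]


def test_username_limit_is_consistent():
    # Fails as soon as someone updates one copy but not the other.
    limits = duplicated_username_limits()
    assert len(set(limits)) == 1, f"username limits disagree: {limits}"
```

The test does not remove the duplication; it only turns "we just have to remember" into a failure that shows up in CI.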

[2b] is to use the type system to make sure the places are consistent, the practicality of which depends significantly on both particular task and choice of technology.

[3b] is adding comments, which is better than relying purely on memory but can often be missed. I've locally improved on this a bit by including cross-references on the relevant lines, along with tooling such that the referents are automatically surfaced during code review.

I wish I'd saved the source, but somewhere I saw the phrase: "Don't repeat concepts"

While it doesn't spell out a word, I think it's much better advice than DRY, and better aligns with your heuristics.

That's actually more or less how DRY is described in The Pragmatic Programmer (20th anniversary edition):

> Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.

Further on:

> DRY is about the duplication of knowledge, of intent. It's about expressing the same thing in two different places, possibly in two totally different ways. [emphasis in original]

And the next paragraph:

> Here's the acid test: when some single facet of the code has to change, do you find yourself making that change in multiple places, and in multiple different formats? Do you have to change code and documentation, or a database schema and a structure that holds it, or...? If so, your code isn't DRY.

Early in my career, a co-worker and I developed the habit, when a bug was fixed, of asking each other "Did you fix it everywhere?"

But that's the problem right there. Why is there more than one place to fix it?

(Now, sure, if you're using a function call wrong, you need to fix it everywhere you call that function, and it's fine that there are multiple such places. That's what functions are for. But if it's something like how a value is calculated... why is there more than one place that calculates the value?)

Related tangent: "AHA" (Avoid Hasty Abstractions) is a decent counter to the over-application of DRY. IME, people reach for DRY too quickly, at the expense of other worthy but more subtle principles.
The opposite principle of DRY is WET: Write Everything Twice.
I think the old "remove duplication the third time you write it" is bad advice as well, to be honest.

First off, it excuses some really egregious behaviour like copying entire large chunks of code representing an identical concept just because "I haven't copied it twice yet". I've seen an entire controller and all related views copied wholesale because it needed to be placed in another area of the app (with identical visual and functional requirements)

Secondly, if the code is only incidentally identical, even 3 times isn't enough to say that you should apply DRY. 3 independent concepts that happen to currently share implementation still shouldn't be deduplicated. I think a great example of this is the inherited resources gem in rails that implements a "standard" interpretation of controller actions - which works well at first until your controllers need to do something other than basic CRUD. The fact that say, an update action handles parameters and saving in the same way as some other update action may currently be true, but is not at all guaranteed to be true in the future, even if you have 100 update actions in your app.
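A rough sketch of the "incidentally identical" case, using plain dictionaries rather than Rails controllers. The two functions and their field names are hypothetical; the point is only that they share a shape today without sharing a reason to change.

```python
def update_article(article: dict, params: dict) -> dict:
    """Apply the permitted fields to an article record."""
    permitted = {"title", "body"}
    changes = {k: v for k, v in params.items() if k in permitted}
    return {**article, **changes}


def update_profile(profile: dict, params: dict) -> dict:
    """Apply the permitted fields to a profile record.

    Same shape as update_article today, but a profile update may later
    need confirmation or auditing steps that an article update never will.
    """
    permitted = {"name", "bio"}
    changes = {k: v for k, v in params.items() if k in permitted}
    return {**profile, **changes}
```

Merging these into one generic updater would save a few lines now, at the cost of coupling two concepts that are free to diverge.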

If you know what application you are writing, there is absolutely no reason to do WET. You should always know if some code repetition is real or accidental.

But we don't always know what application we are writing.

> I think the old "remove duplication the third time you write it" is bad advice as well, to be honest.

Same. “That’s just DRY with extra steps!”

Exactly. I find it helps to also state that it doesn't mean "Don't Repeat Characters"
I've pointed out in the past that if the characters you're deduplicating don't actually have the same meaning, you're just compressing your code; I've been trying to popularize labelling that kind of aggressive misapplication of DRY "Huffman coding".
It's because you are using the wrong term that it's not caught on. Obviously this is run length coding or maybe LZ coding ;).
On the other hand, if you keep repeating the same character sequence there is probably some structure on your code that you can also abstract away. You are not restricted to factoring business concepts.
There's clearly structure you can abstract away; doing so may be convenient, and that convenience may even make it a good idea. I contend, though, that if it does not represent a "piece of knowledge" that you're factoring out then you are dealing with concerns other than "DRY".

I do agree that a piece of knowledge may not be a business concept.

I use, and teach juniors basically the same rule as your first one.

I phrase it a little differently though: "if something happened in the future that required a change to function A, would the same requirement apply to function B as well?"

I think your second heuristic is valid but a little dangerous, because being good at naming functions is somewhat orthogonal to being good at maintaining code.

From my perspective, naming functions well is a key component of maintainable code.

However, in this case, I use it as an extension of the question "is this the same concept?" If it looks like useFloorWaxOrDessertTopping, it's a good clue that you may have the same lines of code but they are certainly not the same concept.

like many older developers, I have become less ideological about DRY over time, particularly as I have seen it motivate extremely abstract solutions that have proven difficult to understand and maintain

i have coined the term "Locality of Behavior" (LoB) as a competing design principle to DRY (as well as Separation of Concerns, SoC) that advocates more inlining of (potentially repetitive) logic in the interest of code maintenance and understandability:

https://htmx.org/essays/locality-of-behaviour/

pic related:

https://pbs.twimg.com/media/FAInxTuVkAATRrn?format=jpg&name=...

Why does the "does it actually work" go down over time? I would expect it to be the other way round: more experience, more chances your software "actually works", even if it's not following strict rules anymore.
There are multiple metrics to describe what “works” even means. You are right that battle tested code can be more stable and fulfill its goals more than new untested code or code written without a customer. At the same time it’s also true that any sizeable codebase, especially production code, will accumulate bugs over time, will almost always slow down over time, and will eventually once large enough become crufty and smelly and unfactorable and difficult and, maybe most importantly, expensive to maintain. Big systems are hard.
> more experience, more chances your software "actually works"

Exactly, so you don't have to think so much about if it will work, you can take that for granted and think more about subtler things.

I'm about 80% sure that this graph eventually ends up back at the initial state.
The axis is labelled "relative importance". If you place non-zero value on readability, you must place less than 100% importance on "does it work".
Why? There is absolutely no correlation between readability and working status. It might be obfuscated code that works perfectly, or it might be obfuscated code that explodes every 2 days. And it might be perfectly clear code badly implementing a business need or implementing it perfectly.
i interpret it this way:

when I was younger it was really important to me that the code work the entire time I was writing it

as I got older it got less important and now I make the code "look right" (that is, read well) and then I get it working

> On re-reading Sandi’s original article it says kind of what I remember it saying, but it also… kinda doesn’t? There’s a lot more talk about programmers honoring the abstractions of elders who came before them

That's because the original article is so clearly about tearing down bad abstractions, but a large majority of programmers - based upon discussion about the article - seem to never get past the first part of it.

Given a long enough time horizon, all abstractions turn bad. The solution isn't to not abstract. The solution is to tear them down when they go bad. And if you don't learn to tear down bad abstractions, your codebase will devolve into shit regardless of what you do.

That may be projection on my part, but I feel like lots of programmers (me included, of course) have a hard time accepting code as a living thing, and would rather build something that "lasts forever". I feel like this is the kind of thinking that pushes us to try to make abstractions that cover all of the cases, spend lots of time on things with little value (in a business context) to "make it right", flaws like that.

Reading the "original SOLID paper" [1] was enlightening to me. The initial assumption is that software rots, or gets less flexible with time. The best way to prevent that is to identify the part that is the least flexible/most rotten and replace it. But for that you need two things: being able to clearly identify parts of the software, and being able to replace them. This is where modularity and abstractions come in. But this is also where the good old delete key comes in.

Building software from parts, expecting to replace them, does lead to abstractions, but different ones from building software expecting it to never be replaced. And, in my opinion, the first kind is easier to deal with.

[1]: https://web.archive.org/web/20150906155800/http://www.object...

> Given a long enough time horizon, all abstractions turn bad. The solution isn't to not abstract. The solution is to tear them down when they go bad.

I disagree with this analysis. Abstractions certainly go bad, but I don't think it's correct to say all abstractions go bad.

The solution I took from Sandi's post was two-fold.

* Don't prematurely abstract, it's better to live with a little duplication vs aggressively eliminating it.

* Don't hold abstractions sacred, when you start seeing an abstraction with too many conditionals, consider breaking apart the use-cases to see if there are actually 2 distinct abstractions happening.

I find it fascinating that people are so against inheritance/polymorphism, these days.

That's one of the absolute best ways to DRY. Factoring out common base classes is a classic OO exercise. It's possible to drastically reduce the size of a codebase, and the potential error exposure, by doing some simple extractions.

To me, inheritance and polymorphism are two different things. Polymorphism is about different units implementing an interface or equivalent protocol and that rocks. Inheritance is, essentially, dumping a bunch of code into your new class, and most of the time just imposes constraints and breaks API boundaries for no good reason. After studying and doing OOP for about 5 years, I don't see the advantages of inheritance over composition. The only value I see is libraries exposing base classes that enforce behavior on user-written subclasses, stuff like React's Component or Java's HttpServlet. Seems to me we can have polymorphism without subclassing as long as the programming language has a half-decent type system.
Yeah; "favor composition over inheritance" remains good advice. "Classical" OO inheritance is brittle and often harmful.
Well, this is one of those "yes and no" situations.

The biggest argument that I hear against OO, is that "someone may misuse or misunderstand it."

I feel that this reflects the tech industry's obsession with hiring armies of relatively inexperienced developers, and then cycling through them, because we don't do what it takes to retain people.

I like composition. I use it often. It is not a "one size fits all" solution to anything; just like OO isn't.

"Reduce state" is another big rallying cry. Good advice, for algorithms, multithreaded service providers, and engines. Not so good, for UI, and, in many cases, device control.

I have spent the last couple of days, working on the login screen for the app I'm developing. It's loaded with state. That can't be avoided, and negotiating the several different states that this -seemingly- innocuous screen can have, is not for the faint of heart, but it needs to be done right, because it's the first screen our users see. It also optionally implements Sign In With Apple, which brings its own baggage. The users' experience must be absolutely frictionless, while also being very secure. The work has involved the server (PHP), the SDK (Swift), and the app, itself (also Swift). I'm not done. I keep uncovering corner cases.

I'm just not a fan of "Don't use X, because X is bad, and you're a bad programmer, if you use X." The tech industry has been dealing with this, since the GOTO wars.

Most of my projects are a hideous chimera of decades-old techniques, mixed with cutting edge stuff.

If someone wants to work on it, then they need to have their stuff together. I'm not going to "dumb it down," but I need to do a lot of documentation (I write about how I document, here: https://littlegreenviper.com/miscellany/leaving-a-legacy/).

Here's an interesting thing that happened to me, some time ago, and I decided to write about it: https://littlegreenviper.com/miscellany/swiftwater/the-curio...

> I feel that this reflects the tech industry's obsession with hiring armies of relatively inexperienced developers, and then cycling through them, because we don't do what it takes to retain people.

That is true (and it won't change, so we need to behave like it won't...), but that is not the only reason.

An inheritance hierarchy in a large program that makes sense today might not make sense in a year or two. Refactoring it is hard. Refactoring a composition-based solution is easier.

Composition-based solutions are also easier to test: inject mocks. It's a lot easier to mock a compositional dependency than it is to mock behavior that's inherited from a parent class.

Given that composition is easier to maintain and test, and that it can achieve the same functionality as inheritance, I've pretty much stopped using inheritance. And I write Java 99% of the time.
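The testing argument can be sketched as follows. The comment is about Java; this uses Python to keep the example short. `SignupService`, `SmtpMailer`, and `FakeMailer` are hypothetical names, not taken from any library.

```python
class SmtpMailer:
    """Stand-in for a real collaborator that would talk to the network."""

    def send(self, to: str, body: str) -> None:
        raise RuntimeError("no network access in this sketch")


class SignupService:
    """Receives its mailer by composition instead of inheriting from one."""

    def __init__(self, mailer) -> None:
        self._mailer = mailer

    def register(self, email: str) -> str:
        self._mailer.send(email, "Welcome!")
        return f"registered {email}"


class FakeMailer:
    """Test double that records calls instead of sending anything."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


# In a test, the fake is injected where production code would pass SmtpMailer().
fake = FakeMailer()
result = SignupService(fake).register("ada@example.com")
```

If `SignupService` instead inherited `send` from a mailer base class, a test would have to override or patch the inherited method rather than hand in a replacement object.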

I write Apple UIKit apps. I am looking forward to using SwiftUI, which is designed to afford use of Protocol-Oriented Programming, reactive/observer stuff, and a lot of lower-state stuff. I think it would have made the work I've been doing in the last few days, much easier.

But if you use UIKit, then it's fairly important to use classic MVC (not MVVM), as that is what the SDK was specifically designed for. Trying to coerce it into other models just causes a lot of extra pain and complexity.

Also, there are models that have been specifically designed (I won't talk about which), to introduce extra complexity. These are made to allow a design to be "broken up," so that parts can be assigned to different developers.

> and it won't change, so we need to behave like it won't...

I sincerely hope that you're wrong. It's been an unmitigated disaster.

While inheritance can certainly work, especially in terms of UI development, the issue is when you are talking more abstract concepts. At that point, it can become really hard to figure out the right lines for what should be inherited vs composed.

In my experience, poorly composed code is simply easier to understand than code which poorly applies inheritance.

For me, inheritance is best used lightly. The obvious smell is when you end up with methods that don't apply to all the subclasses. Or, said another way, when a superclass has a superset of the capabilities of one of its subclasses.

A good example of this is Java's collections, which, for the most part, are quite good with their inheritance. However, because the base classes have mutable methods, it makes it a pain to deal with unmodifiable collections. Java's mistake is they should have had the default collection be unmodifiable and had sub classes which added mutation capabilities. List and MutableList, for example.
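For comparison, Python's standard library is laid out roughly the way this comment proposes: `collections.abc.Sequence` is the read-only interface and `MutableSequence` extends it with mutation. The helper functions below are hypothetical and serve only as illustration.

```python
from collections.abc import MutableSequence, Sequence


def total(values: Sequence) -> int:
    """Needs read access only, so it asks for the read-only interface."""
    return sum(values)


def append_total(values: MutableSequence) -> None:
    """Mutates its argument, so it asks for the mutable interface."""
    values.append(sum(values))


# A tuple satisfies the read-only interface but not the mutable one.
tuple_is_sequence = isinstance((1, 2, 3), Sequence)
tuple_is_mutable = isinstance((1, 2, 3), MutableSequence)
```

With this arrangement an immutable collection never has to expose a mutating method that it can only reject at runtime.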

> After studying and doing OOP for about 5 years, I don't see the advantages of inheritance over composition.

I mean, "prefer composition over inheritance" was in the GoF book which is from 1994, and is one of the core OOP books. If you've just been studying OOP for five years, it's likely that you weren't even BORN when this was already industry-level good practice.

Universities in my country are hopelessly stuck in the early 90s :)
I don't think people are necessarily against polymorphism or factoring out common logic, but they're definitely against big class hierarchies in the old "OO" style.

Plain data structures with polymorphism via traits/interfaces/protocols seems like it's becoming the more popular way to handle these problems (I have no way to prove this, of course), and I prefer that way as well.
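A small sketch of that style in Python, using `typing.Protocol` for the interface and dataclasses for the plain data. The `Shape`, `Rectangle`, and `Square` names are made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class Shape(Protocol):
    """Structural interface: anything with a matching area() qualifies."""

    def area(self) -> float: ...


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height


@dataclass(frozen=True)
class Square:
    side: float

    def area(self) -> float:
        return self.side * self.side


def total_area(shapes: Iterable[Shape]) -> float:
    """Works on any mix of types that satisfy Shape; no shared base class."""
    return sum(shape.area() for shape in shapes)
```

Neither data class inherits from `Shape` or from the other; they are interchangeable only because both provide `area()`.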

It's just the worst though when a dev sees two chunks of code that happen to be "shaped" the same and decides to tie them together with one abstraction. It's like seeing two different cables in a building that carry radically different signals but happen to go along the same path for a while, and someone comes along and zip ties them together. You may have wanted to be able to route one of them completely differently, or maybe you wanted to add one more of the same cable, and now you have to cut up all the zip ties and either re-apply them (shove it into the existing abstraction), or bundle them up some other new way.

There's something to be said for preferring composition over inheritance, but in that case overly DRY code consists almost entirely of glue. As with almost anything it's a matter of choosing the right tool for the job.

The only pieces of code style advice that I've found to hold nigh universally are the following:

- The best code is no code

- Don't end classes in 'er' or 'or'

Coding paradigms are good when they let you do those things and are bad when they don't do both of those things (i.e. they result in more code or classes ending in 'er'; a class named 'Helper' is a code smell worse than sulfur dioxide)

Almost every one of my classes (in my UIKit app) is a ViewController. I don't really need to write anything more.

So they pretty much all are "er" classes. ¯\_(ツ)_/¯

So what you're saying is you've got a bunch of views and most of your programming code is spent on classes that ensure they do the right thing?

I'm not ruling out that this is an illustration of the problem with classes ending in -er.

Not saying it's great.

It's just the way you write iOS apps (at least, the original design way).

It's classic MVC. Most of the action happens in the controllers.

BTW: I totally agree that the best code I write, is the code I don't write.

Simple fix, name it HelperClass instead?
Because a lot of us learned the hard way that inheritance can very easily lead to a dark place. Changing behaviour on a grandparent object in specific circumstances related to the implementation of a child object gets really nasty.
I came to the conclusion that very few people actually understand what a good OO design is, and even fewer have the skills to implement it. All this talk about SOLID, patterns, and whatnot is more likely to result in an overengineered monstrosity than in something elegant and easy to maintain.
One way to understand OO, is to write OO code in a non-OO language (like C). When you need to basically write your own vtables, you really understand how things work.

I had to do that, in the early '90s, because I was writing an SDK that had to be implemented in C, but used object abstraction. It ended up being used for over 25 years.

I think a lot of people have been burned by inheritance done wrong.

With a lot of other techniques, if the implementation isn't perfect and you don't own the library you're using you still have a lot of power available to patch things up and make everything play nicely together. Inheritance mixes implementation details with your type checking, so in a lot of languages that can make it extremely painful to turn a body of almost-good-enough code into something that's actually usable.

You don't have to use OO though. We're using an OO-oriented language at work, but I do a lot of de-duplication of code by extracting common functions.

If needed I can delegate the specialization to an interface or in a function reference parameter (so anonymous functions can be passed) rather than in a subclass.

When implementing the interfaces I might use subclasses though, if I have some very similar variants.
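The function-parameter variant might look like this sketch: the shared logic sits in one function, and the step that varies is passed in as a callable instead of being overridden in a subclass. `export_rows` and `as_csv` are hypothetical names.

```python
from typing import Callable, Iterable


def export_rows(rows: Iterable[dict], format_row: Callable[[dict], str]) -> str:
    """Shared logic lives here; the varying step arrives as a function."""
    return "\n".join(format_row(row) for row in rows)


def as_csv(row: dict) -> str:
    """One possible specialization: comma-separated values."""
    return ",".join(str(value) for value in row.values())


rows = [{"name": "Ada", "email": "ada@example.com"}]

# A named function and an anonymous one both fit the same parameter.
csv_text = export_rows(rows, as_csv)
contact_text = export_rows(rows, lambda row: f"{row['name']} <{row['email']}>")
```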

Base classes tend to limit what your abstractions are though. In practice I've found that while different classes share common behaviour, it's not always so straightforward that it can be arranged into a tree. A mixin type approach, where behaviours can be attached to different classes (particularly if those behaviours don't define any new state) works much better.
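A minimal sketch of the mixin approach in Python. Both mixins are hypothetical and define no state of their own; they only read attributes that the host class already has.

```python
import json


class JsonMixin:
    """Stateless behaviour: serializes whatever attributes the host has."""

    def to_json(self) -> str:
        return json.dumps(vars(self), sort_keys=True)


class LabelMixin:
    """Stateless behaviour: assumes the host class defines an `id` attribute."""

    def label(self) -> str:
        return f"{type(self).__name__}#{self.id}"


class User(JsonMixin, LabelMixin):
    def __init__(self, id: int, name: str) -> None:
        self.id = id
        self.name = name


class Invoice(LabelMixin):
    # Shares labelling with User without sharing a common base class
    # and without picking up JSON serialization.
    def __init__(self, id: int, total: float) -> None:
        self.id = id
        self.total = total
```

Each class picks the behaviours it needs, so the shared code does not have to be arranged into a single tree.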
Abstraction is inherently not about deduplication, it is about capturing intent and meaning. About the universality of certain concepts within your code base. Either based on your problem domain or in the context of your application architecture.

Once you abstract only to shorten your code you’ll likely regret it quickly.

If there is one single article about programming that I positively hate, it is 'duplication is better than the wrong abstraction'. As the Jason Swett article points out, the article seems to instill a sort of fear of refactoring. If there is a 'wrong abstraction', nobody will ever change it and now we are doomed to live with this wrong abstraction for all of eternity. The wrong abstraction can be turned into the right abstraction or can be undone if it is really not going anywhere. If that is what is happening, at least people are trying to improve the code, and if people try something it will eventually work. In many cases a bad code base is difficult to change because it is wrong in so many respects that it is difficult to tell where to start. If there is an attitude of refactoring and improvement, things that are bad can be taken out quickly. Now, one should, of course, not be stupid about removing duplication. If two functions just look vaguely similar, but this is more of a coincidence than something that occurs because of the nature of the problem that these two functions are solving, then they should absolutely not be one abstraction.... I suppose one might need to point that out to some developers, but certainly not to ones who have been developers for some time and who actually have some talent as developers.
Bad abstractions tie together components that shouldn't have been tied together.

Too many bad abstractions are how you quickly end up with that "Bad code base" that you believe is difficult to change - Things are wrong because they're tied together in ways that don't actually make sense, and changing code to support refactoring one use-case creates a wave of cascading changes to other places those abstractions are touched/consumed. If you miss one, or forget an edge case, or have a skimpy test suite - suddenly that refactoring you're so keen on is what's introducing new things that are "wrong" - because they shouldn't have been tied together but were, and you don't understand or remember all of the edge cases.

Basically - My rebuttal is this: It's very easy to refactor a codebase with duplication and introduce an abstraction for the current behavior. It's very HARD to refactor a codebase riddled with abstractions that shouldn't be there.

This means that by default - abstractions should only be introduced very carefully. Refactoring is fine, but you're paying more to refactor a bad abstraction than to refactor duplication. Good programmers understand that most of their value isn't in what their code looks like - it's in what it does for the users. Duplication can feel dirty, and it tends to trigger a "puzzle game" mentality in a lot of programmers, who want to fit the pieces together to make it pretty. AVOID THIS INSTINCT.

Agree completely.

I've dealt with too many code bases that didn't follow this advice. It's extremely expensive to cut down capabilities from code because they are overly coupled.

DRY tends to create things like utility classes and deep dependency trees. For example, I saw a non-ui code base that pulled in JavaFX to use their pair class.

I really like WET (write everything twice). It also fits nicely with the rule of 3.
I really don't like either "the rule of 3" or WET because I think they take the focus off the part of DRY that matters. It shouldn't be about syntactic repetition, but (as originally stated) about repetition of pieces of knowledge.

The focus on syntax misleads in two ways. First, it sometimes motivates consolidation of things that are merely coincidentally syntactically similar (pushing back on this is most of where WET and Ro3 are useful, when they are), which makes it harder to update the code when one "piece of knowledge" needs updating but not the other(s). If you have 10 things that all happen to be identical today, but any of them might change independently in any direction, combining them isn't "more DRY".

Second, the same piece of knowledge may be repeated in different syntax. If your setup means you have to say "there's a button here" one way in HTML, another way in JS, and another way in CSS, then combining those would be "more DRY" even though no syntax repeats. This isn't to say that combining those (say through code generation?) would necessarily be better - DRY is a guideline to be balanced against other guidelines; but I do assert that it is a more useful guideline when we think in terms of duplicated knowledge than in terms of duplicated syntax.
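The code-generation idea could be sketched like this: one hypothetical `BUTTON` description is the single piece of knowledge, and the HTML and CSS forms are derived from it. This is an illustration only, not a recommendation for any particular tooling.

```python
# Single source of truth for the button (hypothetical values).
BUTTON = {"id": "save", "label": "Save", "color": "#0a0"}


def render_html(button: dict) -> str:
    """Derive the markup from the shared description."""
    return f'<button id="{button["id"]}">{button["label"]}</button>'


def render_css(button: dict) -> str:
    """Derive the style rule from the same description."""
    return f'#{button["id"]} {{ background: {button["color"]}; }}'


html = render_html(BUTTON)
css = render_css(BUTTON)
```

Renaming the button's id now means changing one value, even though the two outputs share no syntax.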

The problem I’ve seen with the “rule of 3” in real life is that by the time someone is writing a similar implementation by the third time, 5 years have passed and the entire team has rotated, so the programmer doesn’t have enough context to DRY anymore, and then the failed “big refactor that breaks corner cases” happens.

I find a mentality of _striving_ for DRY by default (even if you end up choosing to duplicate for pragmatic reasons) to be beneficial in keeping programmers looking around for opportunities and refining the understanding of the context, with a better chance of incremental progress.

That's not a problem with "rule of 3", it's a more general problem: Misunderstanding these "rules" and "principles". Specifically, trying to treat them as absolutes. "We must repeat ourselves three times to comply with 'rule of three'." Well, no, that's silly. It's a heuristic, a guideline, not a law that can never be broken. If you spot a clear case of real duplication even before creating the duplication (easier with experience, either total or with the system under development), then you can clean it up earlier. If you can't see the duplication or are uncertain about the duplication ("Is this real duplication, or just coincidence? Am I going to change 90% of the code after all the modifications are done or just one value which could become a parameter?"), then go ahead and repeat yourself.

The same issue arises with YAGNI. People often jump too quickly to shouting YAGNI when the reality may be, and again this comes from experience-enabled judgment, that you are going to need it. "Don't make it a parameter, we aren't using it yet and don't know that we will." "But we do know, it's in the customer requirements that we don't hardcode the database server name everywhere, also it's just sensible."

These rules, principles, guidelines, laws, or whatever term gets assigned to them are there to help. They are not there to be excuses to stop thinking, but to provide a structure around thinking and a way to discuss with other people. But that structure is not absolute, experience and judgement can lead to breaking any of these rules at any time based on the present situation. That present situation that only you (and your team) know, but not the people who discovered, created, or coined the rules. They can only offer advice and guidance and not absolute instruction.

I would only call code without proper abstractions WET (Winnow Everything Thrice). What acronym can we fit into "damp"?
DRY After Multiple Permutations
Don't add multiple parameters.

Delete all multifunctional programs.

Don't AMPlify code.

+1 for Rule Of 3
I once read a comment I wish I'd saved.

It goes along the lines of: W beats X, X beats Y, Y beats Z; in terms of what principle you'd like to apply to your code. One of these letters was essentially representing DRY. It summed things up pretty nicely. Does anyone happen to remember?

I'm willing to bet it was this, because I was so struck by it I saved it:

>I try to optimize my code around reducing state, coupling, complexity and code, in that order. I'm willing to add increased coupling if it makes my code more stateless. I'm willing to make it more complex if it reduces coupling. And I'm willing to duplicate code if it makes the code less complex. Only if it doesn't increase state, coupling or complexity do I dedup code.

https://news.ycombinator.com/item?id=11042400

edit: and of course it's from a long and worthwhile HN thread on Sandi Metz's original article which was the start of the back-and-forth resulting in the article for this thread

> I'm willing to bet it was this, because I was so struck by it I saved it:

This is indeed what I was looking for. I too was struck by it. Thank you SO much.

Can't we all just agree that none of these solutions are perfect and you might actually have to change what you do depending on the circumstance?
But DRYing your code can cause it to shrink...
That's a good thing.
Duplication can be better than integration.