Hacker News new | ask | show | jobs
by wizofaus 1197 days ago
I'm with you - the violations of DRY I still see regularly are clear cases of copying and pasting exactly the same logic (or magic literal value) when there was no reason not to put it in a helper function or named constant that could be referred to in both places. A code review I did yesterday had that - there were 5 or 6 lines of code that, starting with a particular regex, did some data massaging. It was determined in some cases a different reg-ex was needed for a second pass over the data, and so the submitter had simply copied the 5-6 lines of code and just changed the reg-ex used. I see that sort of thing 20 or 30 times more often than code that gets itself into knots because of excessive abstraction trying to avoid code repetition.
2 comments

Some people feel productive when they can type a bunch of code quickly. And the only way to do that is to get muscle memory for writing the same code over and over and over.
I think I can categorically state that virtually every DRY violation I've come across involved little more typing than Ctrl+C Ctrl+V.
I’ve dealt with a few bad coding patterns over the years that were not copy and paste.

Though there have been a couple memorable occasions where I had to completely remove a bad pattern from the code to get people to stop using it.

To be fair, in your example it sounds like the repetition is very local and easily recognised for what it is. Not ideal, but hardly a poster child for when DRY is impactful.

If the change was otherwise good, I would remark on the repetition as "here's how I would write it differently" and not "go back and fix it now".

The first time someone changes four of the cases the same way but misses the fifth, though. That sounds like a good time to refactor.

Just came across another slightly more interesting example. We have code that has to do essentially the same thing for three different document types: it loops through a passed-in list and for each item, adds an object to a document, then initialises that object with details from the item. The "logic" in all 3 cases is exactly the same, yet there were 3 different implementations, one for each document type. The only difference between the 3 is that the function you need to call to add the object is different for each document type, and the library we're using (the MS DocumentFormat.OpenXml library as it happens) doesn't provide an abstraction for just calling the same function regardless of document type. In fact, creating such an abstraction wouldn't be complex at all - if you look at the decompiled MS source, they've basically implemented the same function 3 times, but using an internal function not available to us. As it happens there was a bug in our initialisation code (missing null check!), but of course it was duplicated across all 3 functions. I basically had 3 options:

    a) make the same null check fix in all 3 versions, thus resulting in 3 even longer functions with the same logic
    b) collapse all 3 functions into one and provide an abstraction for the "adding object to document" part
    c) refactor the 3 functions to use the same helper function just to do the object initialisation
In the end I went with c) because it required the least amount of code, and the only genuine duplication really is the foreach statement. But arguably if MS had kept their library DRY in the first place we would never have ended up with so much code duplication.
This is a significantly better example! When non-DRYness leaks out over interfaces it becomes much more of a problem.