by RussianCow 1822 days ago
I agree with everything except for this:

> 4. Delete code, delete tests, re-write stuff, re-write it again. Painful? Keep doing this till you get to the other side and feel liberation and empowerment, took me years to get past wincing at the idea of re-writing that thing AGAIN

With experience I've realized that rewrites are almost never a good idea[0]. They are time consuming and inevitably introduce new bugs (or reintroduce old ones), and the benefit is almost always marginal. If your architecture allows it, it's better to sandbox working legacy code and leave it as-is than constantly rewrite it.

My own policy is: if you're making significant changes to a unit of code anyway, clean it up or rewrite it; otherwise, make the smallest change necessary and leave it alone. If it ain't broke, don't fix it.

(Obligatory Joel Spolsky article on the topic that everyone has probably already read: https://www.joelonsoftware.com/2000/04/06/things-you-should-...)

[0]: ...with a possible exception if you're at a very large tech company with a lot of resources, where the testing/QA processes are a lot more thorough and it's possible to keep up with constant changes like this. But even then, seeing the number of bugs introduced with each update to Facebook or Gsuite or any other large piece of tech, I'm skeptical.


> With experience I've realized that rewrites are almost never a good idea [...] If your architecture allows it, it's better to sandbox working legacy code and leave it as-is than constantly rewrite it.

What if your architecture is the problem? If you got the domain model wrong and you have deeply-nested complex types used throughout, you will never break free without a clean sheet rewrite. Certainly, you can refactor bits and pieces to conform to a correct model, but it's going to be an uphill battle the whole way. 10x if you are already in production with business state tied to the legacy schema(s).

Most of the horrible things I have seen at code review time have a root cause somewhere in poor domain modeling.

For example: Someone put support for 2 customer addresses as facts directly in the Customer type, so now you can't deal with the new edge case of 5+ addresses per customer, or model the idea that the address might be shared between more than 1 customer (and/or some other business types).

If you didn't model for 3NF/BCNF/DKNF up front, you might as well start over from the beginning in my experience. If your problem domain is not that complex, you can probably survive with something really shitty, but the moment you enter into 50+ types, 1000+ facts and 100+ relations, things are impossible to manage without strong discipline in this area.
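
To make the address example concrete, here is a minimal sketch of the normalized shape, assuming plain Python dataclasses stand in for tables. Every name here (`Address`, `Customer`, `CustomerAddress`, `addresses_for`) is made up for illustration and not taken from any real schema. The point is only that addresses become their own type and a separate link record ties them to customers, so "5+ addresses" and "shared address" stop being special cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    address_id: int
    street: str
    city: str


@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str  # note: no address_1 / address_2 fields here


@dataclass(frozen=True)
class CustomerAddress:
    """Hypothetical link record: one row per (customer, address) pair."""
    customer_id: int
    address_id: int
    label: str  # e.g. "billing", "shipping"


def addresses_for(customer_id, links, addresses):
    """Return every address linked to the given customer, in link order."""
    by_id = {a.address_id: a for a in addresses}
    return [by_id[l.address_id] for l in links if l.customer_id == customer_id]
```

With this shape, a customer with five addresses is just five link records, and two customers sharing an address is two link records pointing at the same `address_id`.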

I might, grudgingly, support this kind of rewrite, if it was proposed with a domain model that can already support all existing data. If it's proposed with "a new domain model would be so much nicer! We could do it right!" then hell no.

I'd still prefer to do it incrementally if possible.

I don’t disagree in principle, but I have also never encountered architectural problems so deep that it was impossible to solve them piecemeal. I’m sure they exist, and in those cases I’d say a rewrite is probably worth it in the long run, but I haven’t seen it in practice.

My thinking definitely aligns with yours on this:

> if you're making significant changes to a unit of code anyway, clean it up or rewrite it; otherwise, make the smallest change necessary and leave it alone. If it ain't broke, don't fix it.

Don't be afraid to rewrite something when it needs it, and build knowing it's very possible you'll rewrite later.

If there are good unit tests, and it's a sufficiently small/decoupled piece of code, then rewriting is not so bad. These are all self-reinforcing things: small, decoupled code tends to be easy to rewrite; Testable code tends to be decoupled.

Systems always get more complex over time. The key thing is figuring out when your simple component is starting on the path to getting too complex, and taking the time to rewrite it as early as possible -- maybe it should be two smaller components, or maybe the entire organization of that area needs to be different. If you wait, and just keep adding "small" things, eventually you have a monster that's an order of magnitude more complex to deal with.

The other bit of this is writing code knowing you can (and likely will) rewrite it later. If you try to predict the future complexity early in design, 9 times out of 10 you will get it wrong, and you'll end up in a lose-lose situation: you have an overly complex component to deal with, and it ends up needing a rewrite later anyway. Rewriting this is even harder because you have to undo the unnecessary complexity. This is also known as YAGNI ("You Aren't Gonna Need It").

> If there are good unit tests, and it's a sufficiently small/decoupled piece of code, then rewriting is not so bad. These are all self-reinforcing things: small, decoupled code tends to be easy to rewrite; Testable code tends to be decoupled.

The problem with this mentality is that everyone thinks they write “good unit tests”, but bugs are still found in production. :) You can’t use unit tests as justification for software being reliable; being battle tested in the real world is a much better indicator in practice.

I mostly agree with your other points, although I would still advocate doing these rewrites in as small of pieces as possible, in a way that’s as backwards compatible as possible, instead of all at once. I think you may be saying roughly the same thing, but this thread has shown that people have different definitions of “rewrite”, so it’s hard to tell. :)

> You can’t use unit tests as justification for software being reliable; being battle tested in the real world is a much better indicator in practice.

My point on unit tests to support a rewrite is that it can catch a lot of edge cases, and you can remain reasonably certain you didn't break things if tests are still passing when you're done.

But otherwise I totally agree -- you can have as many tests as you want, with whatever coverage numbers you want, and it will still fall down in the real world. I don't focus on coverage or being dogmatic about TDD. I like to heavily unit test algorithms/regexes/etc -- anything with a defined input and output. At the same time, I hate testing 'glue' code (like an MVC controller), and would way prefer to rewrite it to be so simple it either works or it doesn't work at all (causing a big, obvious failure).

As bugs are found in the real world, ideally you write a test that guarantees that specific bug never comes back. That isn't always possible, and sometimes it takes so much effort that it isn't worth it, but it usually pays off.
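
A rough sketch of what I mean, assuming a made-up `parse_quantity` helper that once blew up in production on input with trailing whitespace. The function, the regex, and the bug itself are hypothetical; the pattern is just "defined input, defined output, one test per bug you actually hit".

```python
import re
import unittest

# Hypothetical helper: accepts digits with optional surrounding whitespace.
_QUANTITY = re.compile(r"^\s*(\d+)\s*$")


def parse_quantity(text: str) -> int:
    match = _QUANTITY.match(text)
    if match is None:
        raise ValueError(f"not a quantity: {text!r}")
    return int(match.group(1))


class ParseQuantityRegressionTest(unittest.TestCase):
    def test_trailing_whitespace(self):
        # Pins the (hypothetical) production bug: "12 \n" used to be rejected.
        self.assertEqual(parse_quantity("12 \n"), 12)

    def test_rejects_negative(self):
        # The sign is not part of the accepted format, so this must raise.
        with self.assertRaises(ValueError):
            parse_quantity("-3")
```

Run it with `python -m unittest` against whatever file you save it in. If someone later rewrites `parse_quantity`, these tests are the part that carries over.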

> I would still advocate doing these rewrites in as small of pieces as possible, in a way that’s as backwards compatible as possible, instead of all at once.

Yes, exactly, that's what I was trying to get at. Big rewrites are hard to get approval/agreement to start, hard to actually do, and often unsuccessful. Worse, a team with the mindset "ah, it's okay if we cut corners here, we're going to rewrite this whole thing someday" will rack up a serious amount of technical debt. Ironically, that debt makes the big rewrite harder to start and to pull off, but also more necessary.

Small continuous rewrites are a way of constantly improving quality, while avoiding a lot of these traps.
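
One way to keep a small rewrite backwards compatible is to run the new code next to the old code for a while and keep serving the old result until they agree. A minimal sketch of that, where `legacy_total`, `rewritten_total`, and `compare_with_legacy` are all hypothetical names and the "log and fall back" policy is my assumption, not something anyone in this thread described:

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def legacy_total(prices):
    # Battle-tested original: leave it alone while the rewrite proves itself.
    total = 0.0
    for price in prices:
        total += price
    return round(total, 2)


def rewritten_total(prices):
    # Candidate replacement for the same small unit of code.
    return round(sum(prices), 2)


def compare_with_legacy(legacy: Callable, candidate: Callable) -> Callable:
    """Call both versions, log any disagreement, always return the legacy result."""
    def wrapper(*args, **kwargs):
        expected = legacy(*args, **kwargs)
        try:
            actual = candidate(*args, **kwargs)
            if actual != expected:
                logger.warning("rewrite mismatch: %r != %r", actual, expected)
        except Exception:
            logger.exception("rewrite raised; legacy result still returned")
        return expected
    return wrapper


total = compare_with_legacy(legacy_total, rewritten_total)
```

Callers only ever see `total`, so swapping in the rewrite for real is a one-line change once the log has stayed quiet. This only works when both versions are safe to call twice (no side effects), which is another argument for keeping the rewritten unit small.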

Imagine if we made bridges this way... Let's just knock it down and rebuild. Who cares about cost or traffic.

In some ways software engineering is so immature compared with other engineering disciplines.

I believe more in the Boy Scout rule: "if you touch it, leave it better than when you arrived".

:-)