Hacker News
by foobarian 1571 days ago
But this happens because of previous painful experiences like the following:

Dev: Hey Steve, I'm working on issue #2312, but it just occurred to me that if I could just refactor that one method in SuperFactory, it'd make the code much cleaner and easier to reuse. Just a quick fix!

Manager: Huh. How much more work is it? If you can time-box it to half a day then go ahead.

Dev: Great, it should take just a couple of hours!

The change is merged and deployed, and several weeks go by...

Data Science: Hi team, we are wondering whether anything changed in this module in the past couple of weeks. The numbers from non-English-speaking domains tanked.

Manager: Uh oh

Dev: Uh oh

Data Science: We only look at data in aggregate, and significance is only reached after weeks of collection. But at this point it looks like we lost millions of dollars.

Manager: oh shit

Dev: faints

... followed by weeks of post-mortems, meetings, process improvements, if not outright terminations.

4 comments

Allowing technical debt to accumulate like that will, with near 100% certainty, damage your development activity sooner or later. The only exception is if you've already damaged it critically in some other way.

Some contrived example where you might lose significant money because you made a generally good change, but it had a bug, and that bug was somehow missed by your entire review and testing process, and the consequence of that bug was able to go unnoticed for a long time in production, and then the result was disastrous, isn't really a very compelling counter-argument.

If you subsequently hold weeks of post-mortems, meetings, process improvements and outright terminations, the person who made the otherwise useful change that had a bug should be among the last to get called out, somewhere after the entire management chain who utterly failed to competently organise critical development and operations activities, everyone responsible for QA who couldn't spot such a critical problem early, and everyone involved in the data science who ran such a hazardous experiment without taking better precautions around validity.

Well. It's not a counter-argument to anything; it's an illustration of how we end up with bad codebases, and why specifically in big enterprises. The incentives are set up in exactly the way that leads to it, particularly by making it expensive to clean up tech debt.

Fair enough if that was the point you wanted to make, though in that case I'd argue that the kind of disaster scenario you described isn't specific to big enterprises but to disastrously bad software development organisations. A lot of small organisations think they're operating at enterprise scale and make the same kinds of mistakes!

That has nothing to do with giving devs leeway.

If you work without tests/QA, you are shooting from the hip. The scenario above as such should not happen. If it ties in with million-dollar processes, even more so. What you are saying is "We don't trust our process, so we do as little as possible outside authorised tasks". Instead, you should fix the process. If this led to post-mortems and process improvements, as in QA/dev process, not simply bug-fixes, then why is the process not improving and/or better trusted now?

Also, the original task is described as a "refactor", so the numbers should not be affected - was it not just a refactor?

> If this led to post-mortems and process improvements, as in QA/dev process, not simply bug-fixes, then why is the process not improving and/or better trusted now?

IME things do improve after taking those steps. But with large code bases there are a lot of layers of products over time and it's difficult (and expensive) to make sure everything is covered. And it's also a moving target.

> Also, the original task is described as a "refactor", so the numbers should not be affected - was it not just a refactor?

IME the most useful tech debt interventions are where some legacy module is deleted or some unused code retired. Unfortunately those are often not provably free of side effects, and sometimes even with a diligent investigation side effects can be missed, especially when the components involved are old and the original creators and product managers have left the company.

At the end of the day cleaning up tech debt has non-zero risk, guaranteed cost, and very often, in the eyes of management, negligible reward. So on average it grows and thus enterprise codebases are born.

Testing is a thing.

OK, but now your “time-boxed” half-day refactor is two man-weeks of testing, bug fixes, back and forth, etc.

If you planned your refactor AND testing and bugfixes to take half a day, and it is going to go way over, you (a) tell your boss your estimate was off and the refactoring needs to be a separate task, and (b) revert it.

Technical Manager: Dev, it's only a quick fix because there are no test cases for that area, and you're not adding any before refactoring, so then you don't think you're breaking anything.
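
To make the point about missing test cases concrete, here is a minimal sketch of a characterization test in Python: record what the existing code does on a handful of inputs (including non-English ones) before touching it, then check the refactor against that record. Everything in it is hypothetical: normalize_domain_legacy, normalize_domain_refactored, and the sample domains are invented for illustration and are not taken from the scenario's actual codebase.

```python
import unicodedata

# Hypothetical legacy method: clumsy, but it preserves non-ASCII characters.
def normalize_domain_legacy(domain: str) -> str:
    result = ""
    for ch in domain.strip():
        result += ch.lower()
    return unicodedata.normalize("NFC", result)

# Hypothetical "quick" refactor: shorter, but encoding to ASCII with
# errors="ignore" silently drops every non-ASCII character.
def normalize_domain_refactored(domain: str) -> str:
    return domain.strip().lower().encode("ascii", "ignore").decode("ascii")

# Invented sample inputs; the last two contain non-ASCII characters.
SAMPLES = [
    "Example.COM",
    "  shop.example.org ",
    "B\u00fccher.example",   # "Bücher.example"
    "\u4f8b\u3048.jp",       # a Japanese-script label followed by ".jp"
]

def record_golden(fn, samples):
    """Pin down current behaviour: map each input to its current output."""
    return {s: fn(s) for s in samples}

def find_regressions(fn, golden):
    """Return the inputs whose output differs from the recorded behaviour."""
    return [s for s, expected in golden.items() if fn(s) != expected]

# Step 1: record behaviour BEFORE refactoring.
golden = record_golden(normalize_domain_legacy, SAMPLES)

# Step 2: run the refactored version against the recorded behaviour.
regressions = find_regressions(normalize_domain_refactored, golden)
print(regressions)
```

With these invented samples, the two ASCII domains pass and the two non-ASCII ones are flagged, which is the kind of breakage that an aggregate metric would take weeks to surface. The sketch only catches what the samples cover, so it reduces the risk rather than removing it.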