Hacker News
by _red 3168 days ago
I've managed developers for the same software product (accounting system) over the last 15+ years. That amount of time gives you some perspective.

A common thing that new-hire developers do is call for "a complete rewrite". They do this because when they first approach a large old code base, it's daunting and seems impenetrable. Of course they are right about that, but naive in thinking a "rewrite" will help. Any rewrite will eventually grow to be just as impenetrable once all features and edge-cases are accounted for.

Fundamentally, any software product is trying to model some aspect of the real world...and the real world is messy, very messy. Governments pass laws that contradict each other, some laws change drastically state by state, employees try new and novel ways to embezzle, different languages and units of measure exist, changing prices for commodities can suddenly cause complete upheavals in manufacturing processes, etc. All of this has to be accounted for, and it's a nearly impossible task.

The bugs that persist are almost never "I click Button A and it does the wrong thing", but almost always "In case that Situation A + B + C all simultaneously exist, the result as interpreted by Agency X is not optimal". Obvious and real bugs get squashed pretty quickly, but those complex situational bugs can linger for a long time. As a manager, you sometimes just need to shrug, because the effort required to fix each and every one of these would produce little to no tangible business value. Moreover, an environmental change could come along to render your "fix" invalid anyways.

Sometimes even during design discussions we are completely aware we are creating "a bug", but the decision is made that the number of people who want Feature A and for whom Situation B also exists will be relatively small. Most often we just design a manual workaround, instead of trying to completely eliminate the bug.

I'm always refreshed and excited by dealing with young devs, particularly for their zeal to fix problems, simplify things, and generally improve the product. Yet, I do feel a bit of sadness in that I know reality is going to temper their enthusiasm after a decade or so. Reality is a very hard thing to model with any semblance of being "bug-free".

> once all features and edge-cases are accounted for.

But one of the benefits of a rewrite is that you can dump all the features and edge-cases that are no longer required. Or fold old edge-cases into new generalities because the business has changed since then.

> the real world is messy, very messy.

Cannot disagree -but- it's nowhere near as messy as the people (often those who are to blame) defending the byzantine software stacks using that argument.

> reality is going to temper their enthusiasm after a decade or so.

I've been doing this professionally for two decades and my enthusiasm for "chuck it away and do it right" hasn't waned one bit.

>> the real world is messy, very messy.

> Cannot disagree -but- it's nowhere near as messy as the people (often those who are to blame) defending the byzantine software stacks using that argument.

This. As a still relatively young developer, I can almost guarantee you that the initial reaction of "nuke it from orbit!!!" doesn't come from a couple of minor abstraction problems. You get this reaction when every second bug you try to fix ends in a trip to Klendathu.

>> reality is going to temper their enthusiasm after a decade or so.

>I've been doing this professionally for two decades and my enthusiasm for "chuck it away and do it right" hasn't waned one bit.

I think it's probably somewhere in between the two extremes. I think you should have good unit tests and then refactor parts of your code where you see better generalities, or where basic code cleanliness was disregarded before.

But throwing all of it away is rarely possible without endangering the profitability of the company for a while.
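To make the test-then-refactor idea above a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical and not taken from anyone's real system: the late-fee rule, the region names and the amounts are invented purely for illustration. The point is only that you pin down what the old code does on a set of cases (including the edge cases) before folding the duplicated branches into something more general.

```python
# Hypothetical legacy rule: late fee in cents, with copy-pasted branches.
def legacy_late_fee(days_overdue: int, region: str) -> int:
    if region == "north":
        if days_overdue <= 0:
            return 0
        return 500 + days_overdue * 100
    elif region == "south":
        if days_overdue <= 0:
            return 0
        return 700 + days_overdue * 100
    raise ValueError(f"unknown region: {region}")


# Refactored version: the per-region difference becomes data.
BASE_FEE_CENTS = {"north": 500, "south": 700}


def refactored_late_fee(days_overdue: int, region: str) -> int:
    if region not in BASE_FEE_CENTS:
        raise ValueError(f"unknown region: {region}")
    if days_overdue <= 0:
        return 0
    return BASE_FEE_CENTS[region] + days_overdue * 100


# Cases that pin down the existing behaviour, edge cases included.
CASES = [
    (-1, "north"), (0, "north"), (1, "north"), (30, "north"),
    (0, "south"), (1, "south"), (30, "south"),
]


def behaviour_is_preserved() -> bool:
    """True if old and new code agree on every pinned case."""
    return all(
        legacy_late_fee(days, region) == refactored_late_fee(days, region)
        for days, region in CASES
    )
```

If `behaviour_is_preserved()` ever returns False after a cleanup, the refactor changed behaviour on a case somebody presumably cared about.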

> throwing all of it away is rarely possible without endangering the profitability of the company

Well, obviously I don't mean "turn it off and wait for the new system to be finished". You build the new one whilst the old one is in maintenance mode and swap in new bits as and when you can.

For example, at current $WORK, the backoffice system is a horror show of overcomplex PHP that is riddled with bugs, and no-one really understands how it all works. Replacing that would be a huge boon to the company, both humanly and monetarily, because CS use it heavily every day.
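A rough Python sketch of what "swap in new bits as and when you can" might look like, assuming requests can be routed per feature. The `Router` class, the handler functions and the feature names are all hypothetical, made up for illustration and not a description of any real system:

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def legacy_backoffice(request: dict) -> str:
    # Stand-in for the old system, which stays in maintenance mode.
    return f"legacy handled {request['feature']}"


def new_refunds(request: dict) -> str:
    # Stand-in for one piece that has already been rebuilt.
    return f"new system handled {request['feature']}"


class Router:
    """Sends each feature to its new handler if one exists, else to legacy."""

    def __init__(self, fallback: Handler) -> None:
        self._fallback = fallback
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, feature: str, handler: Handler) -> None:
        # Swap in one rebuilt piece; everything else keeps using legacy.
        self._migrated[feature] = handler

    def handle(self, request: dict) -> str:
        handler = self._migrated.get(request["feature"], self._fallback)
        return handler(request)


router = Router(fallback=legacy_backoffice)
router.migrate("refunds", new_refunds)
```

With this setup, `router.handle({"feature": "refunds"})` goes to the new code while any feature that has not been migrated still lands on the old system, so nothing has to be switched off while the replacement is being built.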

> You build the new one whilst the old one is in maintenance mode and swap in new bits as and when you can.

Continuous incremental improvement of a production system may, over time, have the same net effect as an idealized big-bang replacement, but it's a very different process. It's usually what people who say you should never do a ground-up replacement prefer instead, because actual big-bang replacements, unlike idealized ones, are usually a shitstorm. The reason is that they are usually done to the kind of systems you describe, overcomplicated key systems with inadequate documentation or institutional memory, and they are done instead of trying to get a firm grasp on each component of the existing system before replacing that component. And so they end up, at best, being exceedingly well designed, but overlooking key elements of business function that were discovered and implemented, but not durably documented, in the original system.

>I've managed developers for the same software product (accounting system) over the last 15+ years.

You will probably appreciate this little piece of anecdata. Last July some fiscal laws in Italy were changed, so that a number of firms had to make a certain tax payment by the 31st of July, BUT the change was communicated/published almost "last minute" (as often happens) and a software house had to update their accounting program on a very tight schedule. The payment code (on the government side) was the same as that of another payment (already known to be due on 31/07/2017), so the programmers, in order to distinguish the two payments, "anticipated" (virtually) the date of one of them in the database, so that two payments resulted, one on the 30th and one on the 31st.

This (intentionally) "queer" behaviour was not explained (or not explained well enough) to the users.

Most users "trusted" the program and everything went well for them. Those that noticed the anomaly and managed to "force" both payments onto the same date ended up with a "single" payment (instead of the two separate ones required), thus messing up the whole thing.

This is the future of learned helplessness. People that see problems and inconsistencies in software systems and try to work around them will have worse outcomes than those that just blindly follow the workflow assigned to them.

Seems like one needs to be controlling the spec and writing the code not to get stuck in this trap.

>People that see problems and inconsistencies in software systems and try to work around them will have worse outcomes than those that just blindly follow the workflow assigned to them.

Well, not always.

That applies ONLY to those that find such problems or inconsistencies and work around them in an incorrect manner.

And this brings us back right to Chesterton's fence:

https://en.wikipedia.org/wiki/Wikipedia:Chesterton's_fence

which can be invoked not only when users do silly things, but also when programmers do them.

Never heard of Chesterton's fence before, and it's quite interesting, but I don't see this as applicable here. This was a design error on the part of the programmers, because the software didn't make it clear why it was "misbehaving".

> This was a design error on the part of the programmers, because the software didn't make it clear why it was "misbehaving"

Partly yes, but only partly, as they did publish the "peculiar" workaround that they (the programmers) used in the update (though without giving it the prominence it deserved), but of course NO user actually reads the boring text that comes with the updates.

In this peculiar case the non-reading users were divided into three groups:

1) non-reading users that didn't even notice the anomaly

2) non-reading users that noticed the anomaly and that either read the accompanying text or called to ask why the anomaly presented itself and were given a reasonable explanation.

3) non-reading users that noticed the anomaly, but, assuming that the programmers were a bunch of good-for-nothing morons [1], forced or "overruled" the settings without asking anything (and of course without even asking themselves if they were possibly causing an issue later on)

Of course both the #1 and #2 were fine, with the difference that the #1 were simply lucky, whilst the #2 "deserved" their success, as they had the curiosity to delve deeper into the issue.

The #3 are the main reason why I posted the Chesterton Fence reference, but it is applicable more generally.

Now that they were all (users and programmers, in different ways) bitten by the issue, most probably the programmers will add a field to the database in the next release, so that you can have more than one payment with the same government code on the same day.

Still, I can bet that in a few years the new kid on the block (among the programmers) will notice that there is a field in the database that is always set to 1. The memory of why that field was added will be lost, and he will probably remove it, saying "Ha! I optimized the database by removing an unneeded field.", thus falling into the same fallacy.
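For the programmers reading, a small Python sketch of why that extra field matters. This is only a guess at the shape of the problem, not the software house's actual schema: the `Payment` class, the `sequence` field name, the code "CODE-X" and the amounts are all hypothetical. Only the 31/07/2017 date comes from the story.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Payment:
    government_code: str
    due_date: date
    amount_cents: int
    sequence: int = 1  # hypothetical extra field; almost always 1


def store_by_code_and_date(payments):
    """Old-style key: same code + same date collapses into one payment."""
    ledger = {}
    for p in payments:
        key = (p.government_code, p.due_date)
        ledger[key] = ledger.get(key, 0) + p.amount_cents
    return ledger


def store_with_sequence(payments):
    """New-style key: the sequence number keeps same-day payments apart."""
    ledger = {}
    for p in payments:
        key = (p.government_code, p.due_date, p.sequence)
        if key in ledger:
            raise ValueError(f"duplicate payment key: {key}")
        ledger[key] = p.amount_cents
    return ledger


deadline = date(2017, 7, 31)
payments = [
    Payment("CODE-X", deadline, 10_000, sequence=1),
    Payment("CODE-X", deadline, 2_500, sequence=2),
]

merged = store_by_code_and_date(payments)    # one entry: the "single" payment
separate = store_with_sequence(payments)     # two entries, as required
```

The field that is "always set to 1" is exactly the fence here: it looks unneeded until the one day two payments share a code and a date.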

[1] BTW, not that the opinion was completely wrong in this specific case. Though I am not a programmer (nor an accountant), I had to deal with some of these guys to import some inventory data coming from another accounting program, and it was a nightmare.