by mirsadm 5228 days ago
Any suggestions for what to do in this situation?

I was unfortunate enough to work at a place with a 10+ year old code base where there had been 3 to 4 software architects who had all had their own vision over the years. The code had been refactored many times already. It consisted of a buttload of singletons that were all initialized at seemingly random places. It was impossible to start or shut down the app deterministically. It was massively multithreaded (in most cases for no good reason). They had this horrible implementation of a shared cache (because many instances of this app ran across many computers that needed to share state).

It had no unit tests. Whenever you changed anything, it would break something entirely different. To me it was an example of everything going wrong...many times.

The best part was that it had to run for weeks because it was a critical application (for fire, police and military services). It didn't. They were instructed to reboot all the machines every day. But it wouldn't start up properly every time, so they might have to do it multiple times. I did my best for the year to clean up the critical parts of it. I was contracting there but they had offered me a full time position by the end.

I finally cracked it when the original "software architect" came back because he had been fired from his old position. He was annoyed that over the years his code had been changed, so he started to put back in place everything that he had done before he left. Not only that, we had an angry complaint from one of the customers about a possible hostage situation that could have turned out very badly because of this crappy system.

We tried to reason with the software manager by suggesting that we assign one or two people to start from scratch and take across code that could be reused. It would take a bit of time to reach the same feature set as the old application, but it had become so difficult to add new features to the old one that it needed to be done.

2 comments

If you're dealing with a complex system that needs a lot of work, I would recommend trying to break it down into subsystems and handle the subsystems one at a time.

The original article does assume that you're dealing with a system that more-or-less works. If your application flatly doesn't do the right thing, then a big rewrite from scratch may really be the best choice - there's nothing to save.

But spaghetti code doesn't just fall from the sky; it occurs because of politics, stubbornness, and bad processes. Over the course of the 10+ years it took to make the code base, is it really the case that nobody but you noticed the problem? It's more likely that there's a lot of pressure going in the opposite direction, and other maintainers didn't know what to do either.

Dealing with subsystems helps you handle both problems. Management might not be willing to let you rewrite the whole thing, but if you said, "Let's just fix the boot-up process for the Cyclotron 4000 resource. Nothing else changes, just the Cyclotron boot." you might be able to get permission. In a badly-maintained project, it's hard to replace all the instances of one service - that's what makes it 'badly-maintained' - but it's still easier than dealing with the whole system in one go. And, of course, instead of 'fixing' the Cyclotron you're actually rewriting it with a new, non-wacko Cyclotron service.

Then you go back to your manager and say, "It was rough, but the Cyclotron 4000 no longer blocks the start-up. Let's get the next thing on the list." Not only do you have a slightly better project, you also have better credibility with management, which makes it more likely you'll be listened to when you say a certain technical measure is necessary. Next, fix the subsystem that talks to the Cyclotron - and so on. Pick the right time to introduce tests, code review, and all the rest.
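
To make the "replace one subsystem" idea concrete, here is a rough C++ sketch of how it might look. Everything in it is hypothetical: CyclotronService, LegacyCyclotron, RewrittenCyclotron, makeCyclotron and startApplication are names I made up for illustration, and the legacy boot is a stand-in for whatever the old code actually does. The point is only that callers depend on a small interface, so the old and new implementations can be swapped in one place.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical interface for the one subsystem being replaced.
class CyclotronService {
public:
    virtual ~CyclotronService() = default;
    virtual bool boot() = 0;
    virtual std::string name() const = 0;
};

// Thin wrapper around the existing code; its behaviour is left unchanged.
class LegacyCyclotron : public CyclotronService {
public:
    bool boot() override { return legacyBoot(); }
    std::string name() const override { return "legacy"; }

private:
    // Stand-in for the old, hard-to-follow boot path.
    bool legacyBoot() { return true; }
};

// The rewritten subsystem, built and tested on its own.
class RewrittenCyclotron : public CyclotronService {
public:
    bool boot() override { return true; }
    std::string name() const override { return "rewritten"; }
};

// The single place where the old/new choice is made. Flipping the flag
// back is the rollback plan if the rewrite misbehaves.
std::unique_ptr<CyclotronService> makeCyclotron(bool useRewrite) {
    if (useRewrite) {
        return std::make_unique<RewrittenCyclotron>();
    }
    return std::make_unique<LegacyCyclotron>();
}

// Callers only ever see the interface, never a concrete class.
bool startApplication(CyclotronService& cyclotron) {
    return cyclotron.boot();
}
```

The first step (wrapping the old code behind the interface) changes no behaviour, which makes it an easy thing to get approved. The rewrite can then be switched on per deployment once it has earned some trust.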

Remember that just as you had the experience working with the terrible code base, your managers had the experience of working with the previous 3 or 4 software architects who "had their own vision" and delivered a product that doesn't start up reliably - I don't think it's surprising that there was no longer the political will to assign people to refactoring or rewriting tasks. Bad architecture uses up the political will needed to approve good architecture, because it makes all "architecture" tasks look bad. You need to regard your reputation as a finite, under-supplied resource just as much as your time and budget, and plan to get more.

From your use of the past tense, it looks like you're no longer in that situation (good for you!)... but that would be my advice if you see a similar situation in the future. I've used this plan in my own career to rewrite a (much smaller, only moderately troubled) project piece by piece over the course of a year.

Good response! Another part of the problem was that the company preferred contractors over full time staff. It had very high turnover because of this. Many had already replaced subsystems with their own versions over time. There had already been many implementations of the Cyclotron subsystem :). To be honest I probably ended up being one of them. Working there was too stressful and the rewards for trying to achieve more were not recognised.

The place had a reputation for hiring highly motivated engineers and burning them out. Just to be replaced by another. When I left, they hired a very talented guy that I worked with for a couple of months. He left recently and the cycle begins again!

You could probably replace all those singletons with a sane DI system in a week, why couldn't you?
It would have been possible but probably not in a week. The code base was written using Qt and half the singletons would be lazily created at seemingly random places through hundreds of signal/slot calls (sometimes through the event queue if it came from another thread).
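
For contrast, this is roughly what explicit wiring looks like in plain C++ (no Qt, so it stays self-contained). It is only a sketch: Database, SharedCache, Application, EventLog and runOnce are hypothetical names, not anything from the real code base. Each object receives its dependencies through its constructor, so start-up order is the member declaration order and shutdown is the exact reverse, instead of depending on which signal happens to touch a singleton first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records start-up and shutdown events so the ordering can be inspected.
using EventLog = std::vector<std::string>;

class Database {
public:
    explicit Database(EventLog& log) : log_(log) { log_.push_back("Database up"); }
    ~Database() { log_.push_back("Database down"); }

private:
    EventLog& log_;
};

// Depends on Database and says so in its constructor, rather than
// reaching for a global instance() from somewhere deep in a slot.
class SharedCache {
public:
    SharedCache(Database& db, EventLog& log) : db_(db), log_(log) {
        log_.push_back("SharedCache up");
    }
    ~SharedCache() { log_.push_back("SharedCache down"); }

    Database& database() { return db_; }

private:
    Database& db_;
    EventLog& log_;
};

// Composition root: the one place that decides what exists and in what
// order. Members are constructed in declaration order and destroyed in
// reverse, which is guaranteed by the language.
class Application {
public:
    explicit Application(EventLog& log) : db_(log), cache_(db_, log) {}

private:
    Database db_;
    SharedCache cache_;
};

// Starts and stops the application once and returns the recorded events.
EventLog runOnce() {
    EventLog log;
    {
        Application app(log);
    }  // app is destroyed here, tearing everything down in reverse order
    return log;
}
```

Calling runOnce() records the database coming up before the cache and going down after it, on every run. Getting a real code base with hundreds of lazily created singletons into this shape is the slow part, which is why a week sounds optimistic.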

The singletons were just one of the many problems. I remember there was a "database.cpp" file which handled all access to the SQL database. It was over 10k lines of code and had hundreds of structs to represent all the tables in the system. The person responsible for that ensured he had a job by only working with that source code.

Wow.

This makes me think that Java is better for this kind of big "enterprise" application, not because it's faster or more enterprise or somesuch, but because it's more limited, and therefore fewer things can go wrong.

I worked in two banks, developing web banking in one and a middleware service in the other, and while there were some strange things (what's with banks and XML, really?) there was nothing that terrible there.

But then I'm pretty sure that someone will share their Java horror story.

I could write a book about how bad everything was set up :). This place loved to abuse XML. It had > 2000 XML files to configure the system. Objects in code were "generic" and instantiated based on XML configuration. You could inherit configuration from base XML files. It was basically impossible to determine where a piece of the system was set up from.

As bad as everything was it was a fantastic learning experience for me though.