Hacker News new | ask | show | jobs
by Aqueous 1392 days ago
Perfectionism is a pitfall, but agile development has led a lot of developers to prize velocity over correctness. Many seem to believe that any upfront mistake can be fixed through iteration. The truth is that there are things that you can't fix through iteration. Things like having an incorrect data model. Some things need to be perfect up front or you will quickly find yourself boxed in down the road as you try to extend the system. Once you've gotten the right design at the center of your system, understanding not only the current requirements but also likely future requirements as well, scrum away. It's just not OK to blow the "big picture" decisions that you need to do in order to build a piece of software the right way because you want to move fast.
4 comments

Incorrect is a spectrum, and perfect is not-- there's a balance to be struck. In some projects, having a perfect data model would require having a perfect design, and I don't know if I've seen that happen once (for anything non-trivial) in my entire career. Depending on the environment and requirements, data model changes aren't even necessarily problematic, let alone catastrophic. You might not even know what your complete data model will look like until further along in the project. In other situations, having to make changes to some base layer of interface functionality could end up being the concrete shoes when it comes to developer time sinks.

The impulse to get the instant gratification of cracking into those first few lines of code should never supplant thoughtful design, but I've seen things go really wrong in the opposite direction, too. I reckon it's all about doing a good risk analysis, considering the costs of having to rework something down the road, and not letting your fear of getting that wrong stop progress longer than it needs to.

When designing a complex system with multiple business modules and multiple services, correctness becomes a little more binary than that. Your data model tends to proliferate across anything that interacts with it - fixing it may require a complete refactor of all of those modules and all of those services. Throw in a process that is constrained by pesky things like needing to preserve WLB, limited surge capacity, and competing priorities, as well as varying levels of experience on your team, this rewrite can potentially take multiple years and require the maintenance of two parallel systems, and yet it must occur because the data model is foundational for everything else. Continuing to build on the wrong data model will keep boxing you in further until it's too late to ever fix it, with every new feature forcing you to fight with a fundamental flaw of your system. The resulting workarounds and hacks will slowly strangle your business logic until it's too complex to change with any confidence. It's kind of like building a house on top of quicksand.

When faced with such constraints, it's really, really important to get this right from the beginning.

Sure. I never said that data models needn't ever be perfect from the jump or that having a perfect design from the beginning would always be required. It really depends on the data complexity, structure, and how it's used. The important thing is to really think about it first.
I like this comment so much, I'm tempted to add a quote to the article
> The truth is that there are things that you can't fix through iteration. Things like having an incorrect data model.

Yes! I spent a year iterating on my prototype for my startup for precisely this reason. The data model kept changing and that had repercussions through every other part of the prototype. Often leading to reworking entire components.

The data model is that the core of everything for my prototype, so it was more than worth the time spent.

Indeed, this is precisely what makes getting it right up front so important. As mentioned above, the data model tends to proliferate across anything that interacts with it. The principles of normalization mean that there is usually one natural way to interact with a given conceptual entity, given current and potential future requirements, but thousands of ways to box yourself in. Getting it wrong can mean the wrong assumptions about the data shape propagate across thousands of lines of code, preventing the possibility of ever implementing certain features in a clean, maintainable way.
> there are things that you can't fix through iteration

> Yes! I spent a year iterating...for precisely this reason

Seems like you're presenting a counter-point while simultaneously agreeing.

I think the takeaway is "a year." Businesses typically don't have a year to fix something like this, so saying something will take a year usually means, "It's impossible."
Totally, but this is a full greenfield project. It's a pretty unusual situation to be in.

It definitely a first for me.

You deliberately omitted the word "prototype" from the quote. They didn't spend a year iterating in production. The prototype is the one you plan to throw away.
> Things like having an incorrect data model.

Absolutely. Take a look at git. It's basically just a Merkle tree which is an ingenius data structure with a very simple set of entities and invariants. Almost every feature of git can be viewed as merely manipulating this structure. If you rolled your own, you may get a fast mvp, but chances are that your data model will be very inefficient or complex for coming features.

This goes both ways. Git doesn't have eg tracking of empty dirs, as a result of the data model (to my understanding). Saying no when it doesn't fit the model is also an important aspect.

As such, the data model can't be decoupled from almost anything else, including the UI. It's well worth spending some serious time researching and planning that. Or, if you're still not confident, you can always prototype a quick-and-ugly POC which contains your main features – that usually helps inform and build intuition about the problem.

Anything can be fixed with iteration. Some just take more iterations than others. We have the benefit of working with text files instead of wood, granite, or the human body.
Right, but in the constraints of a business, "more iterations" may (and probably does mean) "impossible." Second, the iterations to fix the problem block all your iterations to add new functionality - because the data model is foundational for everything else.