by di456 1078 days ago
It is important, but once cheap compute and storage hit an inflection point, it became possible to build wide tables that combine the characteristics of facts and dimensions, and real data modeling took a back seat. That possibility led to a bias toward moving quickly, with less emphasis on making the data model sustainable.

I'm seeing the pendulum start to swing the other way, where the complexity of these scrappy, loosely structured data models is hampering the ability to innovate and even slowing down the business. The models are often inflexible and hard to maintain, with hidden bugs and gotchas.

1 comment

Interesting, this makes me think of a process I am going through: I have a couple of very wide tables, and I feel a need to build fact tables from them simply to get a better understanding of the data and the domain they come from. I have already stumbled a couple of times because the groupings were not at all what I expected. I think this was the last push I needed to build those fact tables once and for all, both to get an overview and to be able to add my own inferred groups to the data.
Sounds like you are on the right track.

A non-technical benefit is that there won't be so much context to keep in your head when triaging data issues in the future.

Designing a database table is very similar to the separation-of-concerns problem in core software engineering. A table is really just an abstraction for storing and fetching various data points.
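To make the separation-of-concerns point concrete, here is a rough sketch of splitting a wide table into a fact table and a dimension table. Everything in it is made up for illustration: the order/customer columns, the `split_wide` and `total_by_region` helpers, and the in-memory lists standing in for real tables. Treat it as one possible shape, not a recommended design.

```python
# Hypothetical wide table: every row repeats customer attributes next to each order.
wide_rows = [
    {"order_id": 1, "amount": 20.0, "customer": "acme", "region": "EU"},
    {"order_id": 2, "amount": 35.0, "customer": "acme", "region": "EU"},
    {"order_id": 3, "amount": 12.5, "customer": "globex", "region": "US"},
]


def split_wide(rows):
    """Split wide rows into a customer dimension and an order fact table."""
    customer_ids = {}  # (customer, region) -> surrogate key
    fact_orders = []
    for row in rows:
        key = (row["customer"], row["region"])
        # Reuse the existing surrogate key, or assign the next one.
        customer_id = customer_ids.setdefault(key, len(customer_ids) + 1)
        fact_orders.append(
            {
                "order_id": row["order_id"],
                "customer_id": customer_id,
                "amount": row["amount"],
            }
        )
    dim_customer = [
        {"customer_id": cid, "customer": name, "region": region}
        for (name, region), cid in customer_ids.items()
    ]
    return dim_customer, fact_orders


def total_by_region(dim_customer, fact_orders):
    """Join facts to the dimension and sum order amounts per region."""
    region_of = {d["customer_id"]: d["region"] for d in dim_customer}
    totals = {}
    for fact in fact_orders:
        region = region_of[fact["customer_id"]]
        totals[region] = totals.get(region, 0.0) + fact["amount"]
    return totals


dim_customer, fact_orders = split_wide(wide_rows)
print(total_by_region(dim_customer, fact_orders))
```

The grouping logic now lives in one place (the dimension), so adding an inferred group means adding a column to `dim_customer` rather than rewriting every wide row.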

Finding the right balance is part art and part science.