Hacker News
by L0stLink 1801 days ago
RDBMS and non-RDBMS both have their place. I have used both in the same system several times, each for the things it was good at. Transactions let you make complex changes with confidence that, if a failure occurs, all partial changes will be rolled back. Making use of database-level validations and enforcing referential integrity is essential for keeping data consistent over the long term and for making data migrations easier. Sure, in trivial applications you can just dump data into a document store and have a validation soup handle the rest, and that too can be implemented cleanly, but knowing when to leverage an RDBMS over other DBMSs is an essential skill for an engineer, because that is how you build scalable systems. Not by throwing JSON stores at everything and calling it a day.
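To make the rollback and referential-integrity points concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts and orders tables, the transfer helper, and the amounts are all made up for illustration, and SQLite is only standing in for whatever RDBMS you actually run; the same idea applies elsewhere, though the syntax and defaults differ.

```python
import sqlite3


def make_db():
    """Create an in-memory database whose rules are enforced by the engine."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key checks off by default; turn them on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " account_id INTEGER NOT NULL REFERENCES accounts(id))"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 0)")
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit
            # above is rolled back along with it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    conn = make_db()
    # Account 1 only holds 100, so the debit step fails after the credit ran.
    transfer_ok = transfer(conn, src=1, dst=2, amount=500)
    balances = conn.execute(
        "SELECT id, balance FROM accounts ORDER BY id"
    ).fetchall()
    try:
        with conn:
            # No account 999 exists, so the foreign key rejects this row.
            conn.execute("INSERT INTO orders (account_id) VALUES (999)")
        fk_rejected = False
    except sqlite3.IntegrityError:
        fk_rejected = True
    return {
        "transfer_ok": transfer_ok,
        "balances": balances,
        "fk_rejected": fk_rejected,
    }


if __name__ == "__main__":
    print(demo())
```

In this sketch the failed transfer leaves both balances as they were, and the orphaned order is rejected with an error rather than stored, which is the behaviour the comment above is relying on.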

If scalability is your concern then you can't use any of the supposedly core features of an RDBMS, since fundamentally there is no way to have a transaction across multiple nodes without solving a much bigger problem.

Validation is vital but the datastore is not the place to do it, because handling invalid data by dropping it on the floor is almost never the right behaviour.

There is no substitute for actually understanding your data model, but once you do, 99% of the time you'll find that using an RDBMS comes with minimal benefits and significant costs.

My experience is the exact opposite: once you understand your data model, 99% of the time you will find that NOT using an RDBMS comes with minimal benefits and significant costs.
It is extremely difficult to design a good RDBMS schema without understanding the data model, and once you do, it is documented there in its entirety, with best-in-class tooling, for anyone else to come along, pick up, and get up to speed with. Additionally, you don't have to forgo document storage: most if not all modern RDBMSs support json(b) types.
RDBMS tooling is a long way away from best-in-class, and the data model is extremely awkward in a way that actually distorts your modelling (no collection columns, no sum types...) and there's no real support for keeping track of schema evolution. I agree that recording your schema model explicitly and keeping track of it is very important (using something like Avro's schema registry), but RDBMS tools are not actually that great at it and using an RDBMS brings in a lot of other baggage.
The point is that you can go pretty far on a single cluster (GitLab example). That 99% figure is trivially wrong in that case.