I've never really gotten the appeal of "schemaless" databases. Your data has a schema, whether you're explicit about it or not, but being explicit allows you (and forces you) to reason about it at the right time at the right level of abstraction.
A big problem is MongoDB isn't schemaless, it is document oriented.
That is a really big schema choice imposed on developers and the root cause of for instance why it took years to be able to create good indexes on inner fields, and they still can't do it efficiently.
The same can be said for simple key value stores.
True schemaless databases are much rarer and provide developers orders of magnitude increases in terms of flexibility, expression and speed.
Unfortunately these kinds of databases are not readily available to a broad audience. It's also the case that for the majority of jobs and developers SQL based stores are still a very good choice unless you need to deal with huge amounts of data from disparate sources without bringing down the system.
Schemaless databases allows you to model your data in a way that is more natural to your domain. The table-relational model used by SQL databases isn't always suitable for all types of data, most notably trees and graphs.
Don't get me wrong, I'd probably pick a traditional RDBMS over MongoDB for most projects, but the object-relational impedance mismatch is a real problem.
> True, but all those migration scripts/patches/table locks add complexity and inconvenience (less agile).
I don't think so. You could say exactly the same thing about maintaining tests for your code. Explicit schemas and constraints (and the effort that goes along with maintaining them) are very much like having tests for your data. They both help ensure that your code actually works when it needs to.
Last time I worked with PostgreSQL, it would lock entire table while adding a new column. Imagine users table, then you have to schedule this kind of altering outside business hours... which impacts devs (they have to stay late and/or come early) and business potentially, it has to wait till next day to have a feature delivered.
All these are trade-offs and everyone decides what is more important for them.
Certainly, but that doesn't mean the schema has to be a strict validation encoded into your storage format. It's a perfectly well-defined programming model to say "well, I'm reading query X with schema Y, and if some rows don't match Y give me nulls instead".
"well, I'm reading query X with schema Y, and if some rows don't match Y give me nulls instead"
Seems like a recipe for disaster to me, but well... A database isn't a "storage format". It's most often the single source of truth for a set of information.
Not being fully sure what data you expect from that source of truth and yet being able to query it is really dangerous. What if you start to update this data after having nullified things you didn't understand ?
Schemaless databases are good for scenarios where the database isn't a source of truth. If you have a table full of e.g. per-second heartbeats from a bunch of deployed services, there's no fundamental underlying truth anyone's trying to gather from it, and you can't afford to run a full schema migration every time someone adds a new metric.
I recognize some people do try to use schemaless databases in the way you're describing, and I agree that's weird and dangerous.