|
> One big anti-pattern I've seen with ORMs is that developers who don't really think in terms of data and relationships use the ORM as a kind of object serialization usable only from the application. Rather than thinking of the database as something useful that could be queried and accessed outside the context of the application, they write objects out to various tables and then re-import and re-construct them once they're needed again in the app, often with dependencies that are in code or even yaml configuration files. The upshot is that you simply can't use the database as anything other than a persistence tier for an app. It really would be no different if they had simply given the objects and id and pickled them to disk. The resulting data store really is that inaccessible and meaningless outside the context of the app. > As a result, if an analyst wants a report, they can't write SQL to get those reports, even though the persistence tier is, if perhaps in name only, a "relational database". And when the app goes away (as it inevitably does), they'll still want to know, say, how many beakers and test tubes were ordered by a lab tech who participated in 100 or more experiments per year with at least one faculty member from radiology between June and December in 2007. > But because they don't really know SQL, they see no value to it, and they're honestly just kinda irritated that they don't have an object database, which is what they understand a database to be - a way to pickle and reimport objects. The application outlasts the developer, and the data outlasts the application. I'm on the developer's side here. Databases are too big and complex to be used by more than one application; if you have a database then it's really important to have a single owner for that database, or you'll never be able to evolve the schema. If you let an analyst write their own queries against your database, you have no control over what queries are running from where or for whom, so you can't so much as rename a column. And sooner or later your production system will lock up because an analyst wrote a query that they didn't realise had too many joins in. If it's important to have an archival record of your data, put that in your design constraints and build it into the system. Even then, I probably wouldn't pick an SQL database as the system for doing it with. |
> it's really important to have a single owner for that database, or you'll never be able to evolve the schema...
IMO, the "owning" application/developers reserve the right to evolve the schema-and if that temporarily breaks ETL, then so be it, but the underlying schema itself shouldn't be hostile to analytics/etc.