Hacker News new | ask | show | jobs
by int_19h 2744 days ago
Because of how different they are, taking advantage of them requires using them in specific ways that are also very different. If you have an abstraction layer that hides those differences, it's practically a given that it does so with the lowest common denominator approach, where you get all the flaws of both and none of their unique benefits.

(Alternatively, people find ways to use the abstraction layer such that it produces the desired usage pattern. Of course, then the code is no longer truly portable to another DB, because that same pattern is likely to be a perf issue there.)

2 comments

I'm pretty sure you can have one interface for YeSQL an NoSQL.

* getUsers() * saveUser() * listUsers() * deleteUser()

and implementation can implement it in specific way to take advantage of chosen technology. If you have some esoteric use cases they could be handled in special way separate from business code.

For that specific use case, maybe. Where it falls apart is in `searchUser`, where the methods (and performance characteristics) of digging through the respective databases are going to be radically different. In a newspaper's implementation, you're going to have to search by date, subject, keyword, body text string, reporter, etc. etc. In MongoDB, that generally involves creating an index on the combination of fields you'll be searching together. In SQL, that generally looks like adding a `where reporter = "Jane Smith"` predicate. The MongoDB version may be faster if you have an enormous amount of data spread across a cluster. The PostgreSQL version will be more flexible when your boss wants to know how many reporters wrote stories about candied yams within three days of the publication of any story containing the word "Thanksgiving".

Being tasked to come up with an abstraction layer that supports the speed of precomputed, clustered indexes with the flexibility of SQL - if I were in a content creation business and not a database engine writing business - sounds like the kind of project that would make me quit my job and go on a one year silent meditation retreat.

That objection doesn't make sense. Queries are nothing more than a tree of predicates, how the backend end uses those predicates is not relevant to the API of specifying the predicates. Things like indexes whether in Mongo or in SQL are implementation details that can easily be hidden and not infect the API. You can interpret the tree of predicates into a SQL where clause or into a Mongo index search.

The OP is correct, your app can speak to an internal API without the underlying database infecting your domain code. That in no way implies you can't take advantage of the best of each database.

This is all rosy in theory. In practice, the way you write the query matters quite a bit. Often even between different SQL implementations.

And it's not just queries. Transactions often have important semantic differences that will be visible on application layer - again, even between different SQL implementations (e.g. MVCC vs locks).

> In practice, the way you write the query matters quite a bit.

Which is hidden in the query interpreter for said db implementation. Each implementation can break down that abstract query into whatever implementation specific query works best in that database.

There's always some abstract way to represent it that doesn't require vendor specific knowledge nor does it remove the ability to apply vendor specific abilities.

Look, I just don't agree with you, I agree with OP. Db specific stuff should be hidden from the domain layer by an abstract query representation and an abstract transaction representation to be plugged in at a later time.

Have you ever implemented an ORM? I did.

Stuff like "each implementation can break down that abstract query into whatever implementation specific query works best in that database" is wishful thinking. It's like saying that Java is faster than C++, in theory, because JIT can produce better code. And in theory, it can. In practice, we're not there yet. Same thing with high-level database abstractions - they're all either leaky in subtle ways, or they constrain you to extremely basic operations that can be automatically implemented efficiently on everything (but e.g. forget joins).

It's not a common denominator. If I need a Customer aggregate or a Payment aggregate it doesn't matter if I get it from SQL or Document database as long as I get that aggregate. My code doesn't care about query implementation.