by treyd 455 days ago
Like most articles that make strong assertive statements like this, it's an oversimplification. Every tool has its place. The author clearly wants to use SQL, and seems to have a problem that would benefit from it, so they should use a SQL DB and not try to use a KV DB.

He’s basically asking for SQL without the query planner because it’s obnoxious. And how often do you have unconstrained, arbitrary queries hitting the database that you don’t get the chance to vet anyway? The database is hidden behind applications 100% of the time, so why is this the predominant design use case?

KV stores give you this behavior; they just drop everything else along with it.

You usually want a query planner, else you'll end up writing a query planner yourself to produce efficient queries.

What is sometimes needed is query plan stability, lack of surprises, and influencing the planner. This is very attainable in the existing SQL databases and is a core feature of the older ones, like Oracle.
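
To make "influencing the planner" concrete, here is a minimal sketch using SQLite through Python's standard `sqlite3` module, not Oracle. SQLite's `INDEXED BY` clause pins a query to a named index, and preparing the statement fails if that index cannot be used, so the plan cannot change silently. The `orders` table, the index names, and the `plan_details` helper are all made up for illustration.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Hypothetical schema: two candidate indexes for the same query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    CREATE INDEX idx_orders_customer ON orders (customer_id);
    CREATE INDEX idx_orders_status ON orders (status);
""")

# The planner is free to choose either index (or neither) here.
free_sql = "SELECT id FROM orders WHERE customer_id = ? AND status = ?"

# INDEXED BY removes that freedom: this query must use idx_orders_status.
pinned_sql = (
    "SELECT id FROM orders INDEXED BY idx_orders_status "
    "WHERE customer_id = ? AND status = ?"
)

free_plan = plan_details(conn, free_sql, (1, "open"))
pinned_plan = plan_details(conn, pinned_sql, (1, "open"))

print("planner's choice:", free_plan)
print("pinned plan:     ", pinned_plan)
```

Which index the unpinned query picks is up to the planner and may vary by SQLite version and statistics; the pinned one should always report `idx_orders_status`.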

I’m not really talking about eliminating the query compilation step; that’s always useful. Dynamic query compilation, however (the planner executing on every submitted query), is generally less useful, though it is fundamental to RDBMS goals. Query plan stability is the point: if you’re stabilizing the query planner, then you’re explicitly opting out of its dynamic capabilities. You don’t need a planner, you need a query compiler (in that it’s one and done, and you can even allow it the luxury of having actual time to optimize).
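
A toy sketch of the distinction, in plain Python rather than a real database: `plan_lookup` stands in for the planning step, `run_dynamic` re-plans on every call, and `CompiledQuery` plans once and reuses the frozen result. All of these names are hypothetical, and the "plan" here is just a filter closure, so this only illustrates where the planning cost lands, not how any actual RDBMS works.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
Plan = Callable[[List[Row], int], List[Row]]

planner_calls = 0  # counts how often the (hypothetical) planner runs


def plan_lookup(column: str) -> Plan:
    """Stand-in for planning: build an access path for `column = ?`."""
    global planner_calls
    planner_calls += 1

    def run(rows: List[Row], value: int) -> List[Row]:
        return [r for r in rows if r[column] == value]

    return run


def run_dynamic(rows: List[Row], column: str, value: int) -> List[Row]:
    """Plan on every call, like a planner that runs per submitted query."""
    return plan_lookup(column)(rows, value)


class CompiledQuery:
    """Plan once up front, then reuse the frozen plan for every execution."""

    def __init__(self, column: str) -> None:
        self._plan = plan_lookup(column)

    def __call__(self, rows: List[Row], value: int) -> List[Row]:
        return self._plan(rows, value)


rows = [
    {"id": 1, "customer_id": 7},
    {"id": 2, "customer_id": 8},
    {"id": 3, "customer_id": 7},
]

# Dynamic: three executions trigger three planning passes.
for v in (7, 8, 7):
    run_dynamic(rows, "customer_id", v)
dynamic_calls = planner_calls

# Compiled: three executions share a single planning pass.
by_customer = CompiledQuery("customer_id")
results = [by_customer(rows, v) for v in (7, 8, 7)]
compiled_calls = planner_calls - dynamic_calls

print(dynamic_calls, compiled_calls)
```

The trade-off the comment describes shows up in the constructor: since `CompiledQuery` pays the planning cost only once, that step could afford to be arbitrarily slow, at the price of never adapting to changed data.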

For proper data warehouses with multiple applications talking to them, planning is probably more useful than not. For the common modern situation of small or medium-sized applications with a 1:1 relationship to their database, I’ve found it fairly rare that my data distribution changes significantly over time, and where it does, a lot more work needs to happen around it anyway.

Hints are the effective solution, but they’re opt-in, weird non-standard extensions to the language and intrinsically tied to the planning engine. The big issue, though, is that they mainly exist to coerce or override the heuristics, which mainly fall over because planners are built to re-execute on every query with no real time to optimize or explore properly (you’ll give a C++ codebase hours to explore, but the RDBMS is granted mere milliseconds, a great injustice).