by valarauca1 4004 days ago
Query optimization is difficult because of the abstract structure and limited indexes. You may end up querying an index that holds EVERYTHING when running the query backwards would be faster. Which order wins depends purely on what you've inserted into the DB up to this point, or more so on how you insert things into the DB.

Don't run an SQL server as your KV store; you'll likely screw up the config and performance will suffer. If you want performance competitive with other DBs, you will likely end up running memcache between your KV store and Query Engine(s).

Don't store values over 1KB. Yes, the database can technically handle them, but in real-world applications it can't do so at the speeds you'd expect.

B-Tree syncs can be slower than you think in a surprising number of cases.


Could you say a bit more about the query optimisation problems you've run into?

The "put your most restrictive clauses first" rule (which is reiterated on the new Best Practices page) usually seems to do the trick in our hands.
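To make the clause-ordering rule concrete, here is a toy sketch in Python. It is not Datomic's query engine: the `run_query` helper, the `(entity, attribute, value)` tuples and the `:user/...` attribute names are all made up for illustration. It evaluates clauses strictly left to right and counts how many intermediate rows each ordering produces.

```python
# Toy left-to-right clause evaluator (hypothetical; NOT Datomic's engine).
# A datom is an (entity, attribute, value) tuple; strings starting with "?"
# are variables, anything else in a clause is a constant.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def run_query(datoms, clauses):
    """Evaluate clauses in the given order.

    Returns (bindings, rows), where rows is the total number of
    intermediate rows produced across all clauses.
    """
    bindings = [{}]
    rows = 0
    for clause in clauses:
        next_bindings = []
        for binding in bindings:
            for datom in datoms:
                candidate = dict(binding)
                matched = True
                for term, value in zip(clause, datom):
                    if is_var(term):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(term, value) != value:
                            matched = False
                            break
                    elif term != value:
                        matched = False
                        break
                if matched:
                    next_bindings.append(candidate)
        rows += len(next_bindings)
        bindings = next_bindings
    return bindings, rows

# 200 made-up users: every one is active, each has a unique email.
datoms = []
for i in range(200):
    datoms.append((i, ":user/active", True))
    datoms.append((i, ":user/email", f"user{i}@example.com"))

broad = ("?e", ":user/active", True)                   # matches every user
narrow = ("?e", ":user/email", "user42@example.com")   # matches one user

broad_first, broad_rows = run_query(datoms, [broad, narrow])
narrow_first, narrow_rows = run_query(datoms, [narrow, broad])

print(broad_rows, narrow_rows)
```

Both orderings return the same single entity, but the broad-first ordering carries 200 rows into the second clause while the narrow-first ordering carries one. A real engine uses indexes rather than full scans, so the absolute numbers mean nothing; only the relative effect of ordering is the point.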

I think Datomic is a very interesting project for a variety of reasons, such as its use of Datalog, immutability, etc.

This particular thing seems like a backwards step though. One of the major reasons relational databases became so popular was that the user didn't have to think about which predicate to put first in the where clause, or which join to do first. Query optimizers can do that much better, and there are decades of research on how to optimize relational queries (including Datalog queries; see Deductive Databases). It is hard for users to make such decisions, and complex queries, views, and runtime parameters make it nearly impossible to reason about the performance of different queries.

I'm sure they'll get to it eventually.

One thing to remember in high-write environments: if you're storing a uuid attribute and it's indexed, use Datomic's squuid (sequential uuid). Rewriting the indexes will go MUCH faster, and you'll avoid latency spikes on queries.
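For anyone wondering why a sequential uuid helps, here is a rough Python sketch of the general idea. It assumes the scheme is "current time in seconds in the leading 32 bits, random bits after that"; that layout is an assumption for illustration and this is not Datomic's code. The `squuid` function and its `now` parameter are hypothetical names.

```python
import time
import uuid

def squuid(now=None):
    """Semi-sequential UUID sketch (assumed layout, not Datomic's code).

    The leading 32 bits hold a timestamp in seconds; the remaining
    96 bits are taken from a random version-4 UUID.
    """
    secs = int(time.time() if now is None else now) & 0xFFFFFFFF
    random_bits = uuid.uuid4().int & ((1 << 96) - 1)
    return uuid.UUID(int=(secs << 96) | random_bits)

# Ids created in later seconds sort after earlier ones, so new entries
# land near the end of a sorted index instead of at random positions.
ids = [squuid(now=t) for t in (100, 200, 300)]
print(ids == sorted(ids))
```

Ids minted within the same second still sort randomly relative to each other, which is why these are only semi-sequential. The benefit is that inserts cluster in one region of a sorted index rather than touching segments all over it.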