by gnat
1894 days ago
How mature is performance optimisation in Datalog? I've joined a place that uses SQL Server, and brought my open source perception of SQL Server as an expensive lumbering behemoth. I'm astonished by how fast SQL Server is, and how mature the tools around it are. We rely on them to tune queries, diagnose contention, etc. My big concern about adopting something like Datomic or Crux is that we'd lose the insight that lets us increase performance without increasing hardware spend.
Crux is usually quite competent at selecting a sensible variable ordering, but when it makes a bad choice your query takes an unnecessary performance hit. The workaround in those situations is to break the query into smaller queries (since we don't wish to support any kind of hinting). Over the longer term we will continue to build more intelligent heuristics that make use of advanced population statistics. For instance, we are about to merge a PR that uses HyperLogLog to inform attribute selectivity: https://github.com/juxt/crux/pull/1472
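To make the HyperLogLog idea a bit more concrete, here is a minimal Python sketch of how a distinct-value estimate per attribute could feed an ordering heuristic. This is an illustration only, not the code in the PR above: the `HyperLogLog` class, the register count, the SHA-1 hash, and the `most_selective_first` helper are all assumptions made for the example.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog estimator for the number of distinct values seen."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers (4096 by default)
        self.registers = [0] * self.m

    def add(self, value):
        # Derive a 64-bit hash from the first 8 bytes of SHA-1.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank is the position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)            # bias constant for m >= 128
        estimate = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = m * math.log(m / zeros)
        return estimate


def most_selective_first(sketches):
    """Hypothetical heuristic: attributes with more distinct values come first."""
    return sorted(sketches, key=lambda attr: sketches[attr].count(), reverse=True)


# One sketch per attribute, fed with the values observed for that attribute.
sketches = {"email": HyperLogLog(), "country": HyperLogLog()}
for i in range(10000):
    sketches["email"].add(f"user-{i}@example.com")
    sketches["country"].add(["NZ", "UK", "US", "DE", "FR"][i % 5])

print(most_selective_first(sketches))
```

The intuition is that a clause binding a value of a high-cardinality attribute (such as an email) should match far fewer entities than one binding a low-cardinality attribute (such as a country), so it is a reasonable place to start the join. A real planner would combine this with other statistics.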
EDIT: it's also worth pointing out that the workaround of splitting queries apart is only plausible because of the "database as a value" semantics that make it possible to query repeatedly against a fixed transaction-time snapshot. This is useful for composition more generally and makes it much simpler to write compile-to-Datalog query translation layers on top, such as for SQL: https://opencrux.com/blog/crux-sql.html
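As a rough illustration of why a fixed snapshot makes query splitting safe, here is a toy Python model. It is not the Crux API: `Node`, `Snapshot`, `submit`, `db`, `entities_with` and `values_of` are hypothetical names invented for this sketch, and the store is a simple append-only log without retractions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Immutable view of the facts as of one transaction (hypothetical)."""
    facts: frozenset

    def entities_with(self, attribute, value):
        return {e for e, a, v in self.facts if a == attribute and v == value}

    def values_of(self, entity, attribute):
        return {v for e, a, v in self.facts if e == entity and a == attribute}


class Node:
    """Toy append-only store; each transaction gets an increasing id."""

    def __init__(self):
        self._log = []   # entries are (tx_id, entity, attribute, value)
        self._tx = 0

    def submit(self, facts):
        self._tx += 1
        self._log.extend((self._tx, e, a, v) for e, a, v in facts)
        return self._tx

    def db(self, tx_id=None):
        """Return a snapshot as of tx_id (defaults to the latest transaction)."""
        tx_id = self._tx if tx_id is None else tx_id
        return Snapshot(frozenset((e, a, v) for t, e, a, v in self._log if t <= tx_id))


node = Node()
node.submit([
    ("alice", "country", "NZ"), ("alice", "email", "alice@example.com"),
    ("bob", "country", "NZ"), ("bob", "email", "bob@example.com"),
])
db = node.db()   # fix the snapshot once, reuse it for every sub-query

# First small query: which entities are in NZ?
in_nz = db.entities_with("country", "NZ")

# A write lands between the two sub-queries...
node.submit([("carol", "country", "NZ"), ("carol", "email", "carol@example.com")])

# ...but the second small query still sees the same snapshot as the first.
emails = sorted(v for e in in_nz for v in db.values_of(e, "email"))
print(emails)
```

Because both sub-queries run against the same `db` value, the combined result is what a single larger query would have returned at that transaction, regardless of concurrent writes.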
[0] https://cs.stanford.edu/people/chrismre/papers/paper49.Ngo.p...