by tosh 4131 days ago
I'd be interested in what Datomic's approach looks like in comparison.
3 comments

AFAIK, Datomic doesn't give you the same kind of distributed load management out of the box, where you can run analytical queries over the whole data set.

Datomic is designed so that the load from analytic queries is local to the querying machine (apart from storage retrieval, which can be cached and replicated), so it does not need to be run on a shadow server or some other system kept apart from the live production system.

Because data in Datomic is immutable, there is no need to worry about locks on tables or documents, or about contention with later writes. A query reads the data as of the moment it starts, from immutable files joined with the database transaction 'novelty' buffer/log accumulated since the last database indexing operation.
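To make the "immutable files joined with novelty" idea concrete, here is a toy sketch in Python. It is not Datomic's API: `Datom`, `Snapshot`, and `basis_t` are hypothetical names used only to illustrate why a reader holding a snapshot never needs a lock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datom:
    """One immutable fact (hypothetical shape, for illustration only)."""
    entity: int
    attribute: str
    value: object
    tx: int  # id of the transaction that asserted the fact


class Snapshot:
    """Read-only view: indexed datoms plus novelty up to basis_t."""

    def __init__(self, indexed, novelty, basis_t):
        # Copy into a tuple so later writes to the novelty log
        # cannot change what this snapshot sees.
        self._datoms = tuple(indexed) + tuple(
            d for d in novelty if d.tx <= basis_t
        )
        self.basis_t = basis_t

    def values(self, entity, attribute):
        return [
            d.value
            for d in self._datoms
            if d.entity == entity and d.attribute == attribute
        ]


indexed = (Datom(1, "user/name", "ann", 1),)          # from the last indexing run
novelty = [Datom(1, "user/email", "ann@example.com", 2)]  # written since then

snap = Snapshot(indexed, novelty, basis_t=2)

# A write that lands after the query started does not affect the snapshot.
novelty.append(Datom(1, "user/email", "new@example.com", 3))
```

The snapshot still answers from the data as of transaction 2, so a long-running analytical query and a concurrent writer never have to coordinate.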

It is also up to the querying machine to store the conclusions from such queries however it likes. (If that means sending them to another Datomic instance, or putting them on HDFS or GFS, by all means.)

A top-of-mind comparison:

Citus is a distributed relational database vs. Datomic is a non-relational database.

Citus uses SQL as its query language vs. Datomic uses Datalog.

Both support joins.

Citus is built on PostgreSQL as its data storage layer. Datomic supports multiple storage layers.

There are more differences that can be found via Google.

From what I gather, underneath Datomic is an event sourcing database, which is a model that already scales "for free".

A further optimization is that the query engine lives in the application, so if you have N application servers you have N CPUs available for querying, as opposed to overloading a master server or having to provision read slaves.
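A rough sketch of that arrangement in Python, with made-up names (`SharedStorage`, `Peer`, `count_where`) that are not part of any real Datomic interface: each application server pulls segments from shared storage, caches them, and does the query work itself.

```python
class SharedStorage:
    """Stand-in for the storage service; it only serves segment reads."""

    def __init__(self, segments):
        self._segments = dict(segments)
        self.reads = 0  # how many times any peer fetched a segment

    def fetch(self, segment_id):
        self.reads += 1
        return self._segments[segment_id]


class Peer:
    """Stand-in for an application server with an embedded query engine."""

    def __init__(self, storage):
        self._storage = storage
        self._cache = {}
        self.queries_run = 0

    def _segment(self, segment_id):
        # Immutable segments can be cached locally without invalidation.
        if segment_id not in self._cache:
            self._cache[segment_id] = self._storage.fetch(segment_id)
        return self._cache[segment_id]

    def count_where(self, segment_id, predicate):
        # The filtering work happens on this peer's CPU, not on storage.
        self.queries_run += 1
        return sum(1 for row in self._segment(segment_id) if predicate(row))


storage = SharedStorage({"orders": (5, 12, 40, 7)})
peers = [Peer(storage) for _ in range(3)]

results = [p.count_where("orders", lambda amount: amount > 6) for p in peers]
repeat = peers[0].count_where("orders", lambda amount: amount > 6)
```

In this toy, storage is touched once per peer, and the repeated query on the first peer is served entirely from its local cache.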

It can be used for event sourcing, but it does prune past data from the current database indexes (for single-valued attributes, as opposed to set-like ones). Old data is still available in older indexes.
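A minimal Python sketch of that distinction, assuming a simplified model with a "current" view and a full history (the `ToyStore` class and its attribute names are invented for illustration, not Datomic's actual index layout):

```python
from collections import defaultdict


class ToyStore:
    def __init__(self, many_attrs=()):
        self.many_attrs = set(many_attrs)   # attributes that hold a set of values
        self.history = []                   # every assertion, never pruned
        self.current = defaultdict(set)     # (entity, attr) -> live values

    def assert_fact(self, entity, attr, value):
        self.history.append((entity, attr, value))
        key = (entity, attr)
        if attr not in self.many_attrs:
            # Single-valued: the new value replaces the old one in the
            # current view, but the old one stays in the history.
            self.current[key].clear()
        self.current[key].add(value)


store = ToyStore(many_attrs={"user/tags"})
store.assert_fact(1, "user/email", "old@example.com")
store.assert_fact(1, "user/email", "new@example.com")
store.assert_fact(1, "user/tags", "admin")
store.assert_fact(1, "user/tags", "beta")
```

Here the current view shows only the latest email but both tags, while the history still contains the old email.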