by grayrest 4666 days ago
Datomic is interesting because it's a different take on what a database should look like. The TLDR version by someone who's looked into it a bit but not actually used it:

* Storage, transactions, and querying are separated, as in they run in different processes/machines.

* Data is immutable. Storage is pluggable and has implementations on top of Dynamo/Riak.

* Transaction semantics and ordering are controlled by a single process for consistency. This is the write scaling caveat. It's less of a restriction than it sounds (if you're thinking SQLite2 like I did) because there aren't writes/queries competing for resources, it's just the sequencing.

* Queries on the db are performed in-client and can interoperate with client code and state. When you write a query, Datomic pulls the data from storage to the local machine and performs the query there.

* Queries are written in a logic programming language called Datalog. Even if you aren't interested in the rest, I'd recommend spending an hour working through http://learndatalogtoday.org/ just for the exposure to logic programming.
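To give a feel for what a Datalog-style query does, here is a toy sketch in Python. It is not Datomic's API or query syntax: the fact layout (entity, attribute, value), the `?`-prefixed variables, and the `match` and `query` helpers are all assumptions made up for illustration. Each pattern is unified against every fact, and a variable shared between patterns acts as a join.

```python
# Toy Datalog-style matcher (illustrative only, not Datomic's query engine).
# Facts are (entity, attribute, value) tuples; strings starting with "?" are variables.
FACTS = [
    (1, "person/name", "Alice"),
    (1, "person/age", 34),
    (2, "person/name", "Bob"),
    (2, "person/age", 27),
]


def is_var(term):
    """A pattern term is a variable if it is a string starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Unify one pattern with one fact; return extended bindings or None."""
    result = dict(bindings)
    for term, value in zip(pattern, fact):
        if is_var(term):
            if term in result and result[term] != value:
                return None  # variable already bound to something else
            result[term] = value
        elif term != value:
            return None  # constant in the pattern does not match the fact
    return result


def query(find, where, facts):
    """Return the `find` variables for every way to satisfy all `where` patterns."""
    solutions = [{}]
    for pattern in where:
        solutions = [
            extended
            for bindings in solutions
            for fact in facts
            if (extended := match(pattern, fact, bindings)) is not None
        ]
    return [tuple(b[v] for v in find) for b in solutions]


# "?e" appears in both patterns, so the two patterns are joined on the entity.
result = query(
    ["?name", "?age"],
    [("?e", "person/name", "?name"), ("?e", "person/age", "?age")],
    FACTS,
)
print(result)  # [('Alice', 34), ('Bob', 27)]
```

The point is only that you describe the shape of the data you want and let the engine find the bindings, rather than spelling out the joins yourself.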


You mean the whole data is fetched to the client and only queried afterwards? Why did they choose this way?
Only the range of data the client is interested in is fetched from the storage layer.

As for why they chose this, you'd have to ask them to be sure.

But two reasonable assumptions are: 1. they wanted the storage layer to be "dumb", in particular so that they could use existing services like Dynamo. 2. they wanted reading processes to be totally independent. Readers can talk directly to the dumb storage layer without any centralized resource coordinator to execute queries. That means horizontal scalability in the strict sense.

Only the parts of the indexes needed to answer your exact query are retrieved. The bonus is that this data is now local, so traversing deep structures often approaches the speed of hash-map lookups. As someone who has worked on very complex SQL databases, this is a major win.
Only the data you need is fetched (and cached) so the client only has a subset of the database.
Indexes and chunks of data that are used often remain cached in each application instance, and new changes are streamed to the application cache.
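A rough sketch of that caching idea, in Python. The names `DumbStorage` and `CachingReader` and the segment keys are hypothetical, and this leaves out the streaming of new changes entirely. It only shows why immutable segments are easy to cache: once a segment is fetched it can never go stale, so the reader never has to invalidate it.

```python
# Illustrative read-through cache over a "dumb" key-value storage layer.
# All class and key names here are made up for the example.
class DumbStorage:
    """Stand-in for a plain key-value store; counts network-style fetches."""

    def __init__(self, segments):
        self._segments = dict(segments)
        self.fetches = 0

    def get(self, key):
        self.fetches += 1
        return self._segments[key]


class CachingReader:
    """Application-side reader that keeps fetched segments in local memory."""

    def __init__(self, storage):
        self._storage = storage
        self._cache = {}

    def segment(self, key):
        # Segments are immutable, so a cached copy is always still valid.
        if key not in self._cache:
            self._cache[key] = self._storage.get(key)
        return self._cache[key]


storage = DumbStorage({"seg-a": ("Alice", "Bob"), "seg-b": ("Carol",)})
reader = CachingReader(storage)

first = reader.segment("seg-a")   # goes to storage
second = reader.segment("seg-a")  # served from the local cache
print(storage.fetches)  # 1
```

Only `seg-a` was ever requested, so `seg-b` never leaves storage; the client holds just the subset it has actually used.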

It means that reads from a hot cache do not touch the network. Reads are very fast and scale "out". You can write code that does a lot of reads without caring much about performance. (SQL reads only scale "up" and you care very much about their performance.)

Datomic is like Git (distributed reads, central writes); Postgres is like CVS/SVN (centralized reads and writes). This is made possible by immutable history.
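Here is a minimal Python sketch of the "central writes, immutable history" part of that analogy. The `Transactor` class and `as_of` function are names invented for this example and the data layout is an assumption, not how Datomic actually stores anything. One object hands out transaction numbers and appends to a log that is never modified, and any reader holding a copy of the log can reconstruct the database as of any past transaction without coordinating with the writer.

```python
# Illustrative single-writer, append-only log (not Datomic's implementation).
class Transactor:
    """The only writer: assigns increasing transaction numbers, appends facts."""

    def __init__(self):
        self._log = []      # append-only list of (t, entity, attribute, value)
        self._next_t = 1

    def transact(self, facts):
        t = self._next_t
        self._next_t += 1
        self._log.extend((t, e, a, v) for e, a, v in facts)
        return t

    def log(self):
        # Hand readers an immutable snapshot of the history so far.
        return tuple(self._log)


def as_of(log, t):
    """Latest value per (entity, attribute), using only transactions <= t."""
    view = {}
    for tx, e, a, v in log:
        if tx <= t:
            view[(e, a)] = v
    return view


transactor = Transactor()
t1 = transactor.transact([(1, "name", "Alice")])
t2 = transactor.transact([(1, "name", "Alicia")])

history = transactor.log()
print(as_of(history, t1)[(1, "name")])  # Alice
print(as_of(history, t2)[(1, "name")])  # Alicia
```

Because old entries are never overwritten, the view as of `t1` stays answerable after `t2` has been written, which is what lets readers work from cached data independently.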