| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by JulianWasTaken 1145 days ago

I don't doubt your story of course, and I love Postgres, but comparing apples to oranges no?

Datomic's killer feature is time travel.

Did you simply not use that feature once you moved off Datomic (and if so why'd you pick Datomic in the first place)? Or are you using Postgres using some extension to add in?

3 comments

bvanderveen 1145 days ago

We implemented it in Postgres with 'created_at' and 'deleted_at' columns on everything and filtering to make sure that the object 'exists' at the time the query is concerned with. Changes in relationships between objects are modeled as join tables with a boolean indicating whether the relationship is made or broken and at what time.

Our data model is not large and we had a very complete test suite already, so it was easy to produce another implementation backed by postgres, RAM, etc.

wwweston 1145 days ago

Yeah, it seems you could be able to substitute thoughtful schema design avoiding updates/deletes for time-travel as a feature.

I wonder if anyone has made a collection of reference examples implemented this way (and in general think that a substantial compendium good examples of DB schema and thinking behind them could be worthwhile).

pjot 1145 days ago

It’s called a slowly changing dimension. In this example, it’s a type-2.

https://en.m.wikipedia.org/wiki/Slowly_changing_dimension

twic 1145 days ago

I'm moderately confident you could mechanically transform a time-oblivious schema into a history-preserving one, and then write a view on top of it which gave a slice at a particular time. Moderately.

ysleepy 1145 days ago

That is essentially what MVCC does.

JohnBooty 1145 days ago

Yes, although AFAIK those hidden MVCC columns (xmin, xmax?) aren't very usable from an application standpoint -- the obsoleted rows only hang around until the next VACUUM, right?

I realize you're not claiming those columns are useful from an application perspective. Just curious to know if I'm wrong and they are useful.

Because as I understand it, the selling point of Datomic is their audit trail functionality and that is admittedly a bit onerous to implement in a RBDMS. Even though I feel like every project needs/requires that eventually.

ysleepy 1143 days ago

I meant MVCC is the proof that you can automate the transform of a schema into a versioned schema. How and if the DBMS exposes that is another concern.

The garbage collection / VACUUM part of an MVCC system is the harder part, saving all versions and querying a point in time is the easy one.

twic 1144 days ago

Oracle lets you use the MVCC data to query past states of the database, called "flashback":

https://docs.oracle.com/en/database/oracle/oracle-database/2...

You can configure how long the old data is kept:

https://docs.oracle.com/en/database/oracle/oracle-database/2...

Worked examples:

http://www.dba-oracle.com/t_rman_149_flasbback_query.htm

twic 1145 days ago

That is an extremely good point.

mjul 1144 days ago

Snodgrass wrote a whole book on that topic: Developing Time-Oriented Database Applications in SQL (1999).

It is available as PDF on his publications page:

https://www2.cs.arizona.edu/~rts/publications.html

rehevkor5 1145 days ago

Maybe search around on bitemporal database table modeling.

jskulski 1144 days ago

Ive built a couple systems that would have been datomic’s bread and butter.

Each time the company was more comfortable with mainstream dbs, so we ended going with something like you’re talking about, built on top of a db. A couple of the projects were because a mainstream dbs wouldn’t scale.

The systems definitely worked, but it was also a lot of implementation complexity on an other wise simple business prop: “store this data as facts”

btown 1145 days ago

https://www.postgresql.org/docs/11/contrib-spi.html#id-1.11.... discusses a model for implementing time travel in Postgres <12 using SPI. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit... discusses why it was removed in Postgres 12 - it seems logical that it's more maintainable to implement in plpgsql, though as far as I can tell there aren't off-the-shelf implementations of this.

We use https://django-simple-history.readthedocs.io/en/latest/ (with some custom tooling for diff generation) for audit logs and resettability, and while you can't move an entire set of tables back in time simultaneously, it's usually sufficient for understanding data history.

ithrow 1145 days ago

Datomic's 'time travel' is an audit feature, not something for your application/business logic to depend on. Performance reasons make it impractical, unless you only have like 10 users and very little data.

JulianWasTaken 1145 days ago

That's certainly not how it sells and markets itself.

The first feature on benefits (and the only reason I've ever heard Datomic brought up and/or considered it myself for production workflows) is using that stuff in application workflows: https://docs.datomic.com/pro/time/filters.html#history

Could be you're saying it in fact doesn't work well performance-wise, that'd (surprise me but) certainly explain why it's not more popular -- but I think it's clear it wants you to use this as an application feature.

newlisp 1145 days ago

Welcome to sales tactics ;)

Datomic is great but as another commenter said, is good for "small-ish backoffice systems that never has to be web scale". You almost probably can rely on querying history for internal applications. I think their primary market was for companies to use it internally but they never made this clear.

KingMob 1133 days ago

Ironically, Hickey fired the one marketer they hired for Datomic.

He lucked out when a unicorn went all in on it. Word around Cognitect was, Datomic was barely breaking even.

xmlblog 1145 days ago

> "small-ish backoffice systems that never has to be web scale". Doesn't production use of Datomic by Nubank and Netflix (to mention just two examples) belie this assertion?

KingMob 1145 days ago

Alternatively, Datomic wasn't performing up to snuff, and they found it cheaper to buy Cognitect than do a DB migration :D

bilkow 1145 days ago

Do those companies specify what they use it for? They probably have their own internal "small-ish backoffice systems".

oblio 1145 days ago

Nubank is one thing, but for Netflix, just like for any big company 10000 DB technologies are probably in use at the same time.

And 9996 of them are used for stuff like the internal HR DB or other minor projects.