
by A_Beer_Clinked 3966 days ago

Lots of databases are configured to do both. The tables store what we normally think of as "the data", and the log stores the changes. The tables are like HEAD in git, and the transaction log is like the chain of commits.

In principle you could just query the transaction log for every change to your data and compute the final state every time. Obviously this would be onerous, so in normal operation we just use the latest state.
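To make the "compute the state from the log" idea concrete, here is a rough sketch. The `(op, key, value)` entry shape and the `replay` function are made up for illustration; a real transaction log is a binary format with far more detail. Stopping the replay early is the same idea as rewinding to an earlier point.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

# Hypothetical log entry: (operation, key, value). This shape is an
# assumption for illustration, not any real database's log format.
LogEntry = Tuple[str, str, Optional[Any]]


def replay(log: Iterable[LogEntry], upto: Optional[int] = None) -> Dict[str, Any]:
    """Rebuild table state by applying log entries in order.

    If `upto` is given, only the first `upto` entries are applied,
    which yields the state as of that point in the log.
    """
    state: Dict[str, Any] = {}
    for index, (op, key, value) in enumerate(log):
        if upto is not None and index >= upto:
            break
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return state


log = [
    ("set", "a", 1),
    ("set", "b", 2),
    ("set", "a", 3),
    ("delete", "b", None),
]

latest = replay(log)           # the state after the whole log
earlier = replay(log, upto=2)  # the state after only the first two entries
```

Here `latest` plays the role of the table (HEAD) and `log` is the chain of commits; the table is just a cached result of the replay.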

When things go wrong the transaction log is useful for understanding why and also rewinding/replaying the database to the correct state.

Some databases ship these transaction logs around between replicas to keep them all in sync.
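A toy version of log shipping, under the same kind of assumptions: the `Replica` class and the `(op, key, value)` entries are hypothetical, and real systems also deal with networking, ordering guarantees, and failures. The point is only that each replica remembers how far into the log it has applied and picks up from there.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical log entry shape, for illustration only.
LogEntry = Tuple[str, str, Optional[Any]]


class Replica:
    """Keeps a copy of the data in sync by applying shipped log entries."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {}
        self.applied = 0  # number of log entries already applied

    def catch_up(self, log: List[LogEntry]) -> int:
        """Apply every entry not yet seen; return how many were applied."""
        new_entries = log[self.applied:]
        for op, key, value in new_entries:
            if op == "set":
                self.state[key] = value
            elif op == "delete":
                self.state.pop(key, None)
            else:
                raise ValueError(f"unknown operation: {op!r}")
        self.applied = len(log)
        return len(new_entries)


primary_log: List[LogEntry] = [("set", "x", 1)]
r1, r2 = Replica(), Replica()

r1.catch_up(primary_log)  # r1 syncs early; r2 is still behind
primary_log.append(("set", "y", 2))
primary_log.append(("delete", "x", None))

applied_r1 = r1.catch_up(primary_log)  # only the two new entries
applied_r2 = r2.catch_up(primary_log)  # all three entries
```

Both replicas end up with the same state even though they caught up at different times, because they apply the same entries in the same order.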

The work presented here is an interesting application of the same basic mechanism to keep different flavours of datastores in sync.

Recently we very briefly explored the idea of using this mechanism to implement partial replication for partitioned reporting data stores. Unfortunately our current platform, SQL Azure, doesn't grant direct access to the transaction log. (On balance this is a good thing, because the platform is handling all the replication etc.)