Hacker News new | ask | show | jobs
by fragola 3592 days ago
It's a trade-off. There are a lot of reasons to duplicate (faster reads, etc.), with the big drawback being consistency. The duplications are basically caches, with similar pros and cons.

I think a good solution is to use something like DynamoDB and use hot/cold tables. Like, if you're storing something high read/write (i.e. today's olympic stats), keep it in the hot table with a frequent cache timeout. Later, when they become yesterday's stats, move them to the cold table which will basically be read-only, with a longer valid cache.

The problem with MongoDB is that it discourages this type of data management, at least how I've seen it used.

1 comments

If it's like Cassandra, then you can kind of get around consistency problems by, for instance if you have a "replication" factor of 2, by instructing it to write to "2 instances before returning" (and then you turn around and read it after that, if you want to double check). Though of course while it's writing to the first, then second replica, it'll be inconsistent, but not for too long :)
Yes, that is the difference between strongly consistent and eventually consistent.