by justinsb 3981 days ago
I think it is interesting that AWS seems to be moving to consistent data-stores. Previously they were championing eventual consistency everywhere, even when it made for painful products (SimpleDB) or painful APIs (retry loops when using EC2 APIs).
4 comments

My impression is that they have several different backend data stores, each with different trade-offs and consistency models, and they choose the one that makes sense for each app/service.

The way Werner describes the ECS data store sounds very similar to Google's Megastore:

To achieve concurrency control, we implemented Amazon ECS using one of Amazon’s core distributed systems primitives: a Paxos-based transactional journal based data store that keeps a record of every change made to a data entry. Any write to the data store is committed as a transaction in the journal with a specific order-based ID. The current value in a data store is the sum of all transactions made as recorded by the journal. Any read from the data store is only a snapshot in time of the journal. For a write to succeed, the write proposed must be the latest transaction since the last read.
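To make the quoted description concrete, here is a minimal single-process sketch of a journal-backed store with optimistic concurrency. It is not Amazon's implementation: the names `JournalStore` and `StaleWriteError` are made up for illustration, and the Paxos replication of the journal is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


class StaleWriteError(Exception):
    """Raised when a write is based on a read that is no longer the latest."""


@dataclass
class JournalStore:
    # Append-only journal of (key, value) transactions.
    # A transaction's ID is its 1-based position in this list.
    journal: List[Tuple[str, Any]] = field(default_factory=list)

    def read(self) -> Tuple[int, Dict[str, Any]]:
        """Return (latest transaction ID, snapshot built by replaying the journal)."""
        snapshot: Dict[str, Any] = {}
        for key, value in self.journal:
            snapshot[key] = value
        return len(self.journal), snapshot

    def write(self, key: str, value: Any, last_seen_id: int) -> int:
        """Append a transaction only if nothing was committed since the caller's read."""
        if last_seen_id != len(self.journal):
            raise StaleWriteError(
                f"read at tx {last_seen_id}, journal is now at tx {len(self.journal)}"
            )
        self.journal.append((key, value))
        return len(self.journal)


store = JournalStore()
tx_id, _ = store.read()
store.write("task-1", "RUNNING", tx_id)      # succeeds, becomes tx 1
try:
    store.write("task-1", "STOPPED", tx_id)  # still based on tx 0, so rejected
except StaleWriteError as err:
    print("rejected:", err)
```

A rejected writer has to read again and retry, which is the usual cost of optimistic concurrency.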

I believe raw eventual consistency has failed as a programming API. I believe CRDTs in their many incarnations provide a great alternative but i) CRDTs are quite new and ii) they require a more complex API.
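As a rough illustration of the extra API surface, here is a sketch of one of the simplest CRDTs, a grow-only counter (G-Counter). The class and method names are my own, not taken from any particular library.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each replica increments only its own slot."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        # The counter's value is the sum of every replica's slot.
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Even for a plain counter, the caller has to deal with replica IDs and an explicit merge step, which is the extra complexity in question.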
I don't know, S3 and DynamoDB are both eventually consistent. And keeping in mind the CAP theorem, it makes sense. And I for one love SimpleDB: it's just that, simple. And great for prototyping (really cheap) and small production loads. Often you just need a place to stick your data; scalability can be achieved by adding a caching layer.
You can choose eventual or fully consistent in DynamoDB. Given that full consistency comes at a higher cost (read from a quorum of replicas) we expose that cost to you.

BTW nobody wants eventual consistency; it is a fact of life among many trade-offs. I would rather not expose it, but it comes with other advantages ...
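For anyone who has not used it: the choice is made per read, through the `ConsistentRead` parameter of DynamoDB's GetItem call. A small sketch of building such a request follows; the helper name `build_get_item_request`, the table name, and the key are placeholders I made up.

```python
from typing import Any, Dict


def build_get_item_request(
    table_name: str, key: Dict[str, Dict[str, str]], consistent: bool
) -> Dict[str, Any]:
    """Build GetItem parameters; ConsistentRead=False means an eventually consistent read."""
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": consistent,
    }


# Hypothetical table and key, for illustration only.
params = build_get_item_request("tasks", {"id": {"S": "task-1"}}, consistent=True)
print(params)
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("dynamodb").get_item(**params)`.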

They have actually been making S3 _more_ consistent over time: in the newer regions you get, e.g., read-your-writes for object creation. DynamoDB also supports consistent reads, though it still defaults to eventual consistency.

In my mind, there's definitely a trend towards consistency here. I'd love to see an AWS blog post about the reasons behind this!

We should be glad. Eventual consistency is hard to reason about, particularly when its tradeoffs have to do with other people's systems...
US Standard now provides read-after-write consistency when accessed through the Northern Virginia endpoint [1].

[1] http://aws.amazon.com/s3/faqs/#What_data_consistency_model_d...

I think you can force consistency on DynamoDB for a price.
What are some other examples?

In general, consistency vs. availability (or neither, that's possible too of course) is a trade-off, and you'd want to pick one over the other depending on your business case.

I know I'm bound to be proved wrong here, but I think _every_ product after the original set (EC2, S3, SimpleDB) has been consistent or now has a consistency option: the major ones are RDS, EBS, EFS, DynamoDB, RedShift, Elasticache, Route53, Kinesis, SES.

Some of those APIs are sort of odd, admittedly, and could be hiding eventual consistency under the covers (Route53 in particular springs to mind there!)

Edit: And S3 and SimpleDB now expose more consistency than they did at launch.

DynamoDB encourages eventual consistency by charging only 0.5 read capacity units for each eventually consistent read, versus a full unit for a strongly consistent one.
I consider consistency a bargain at twice the price :-)
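To put numbers on the pricing discussed above, here is a back-of-the-envelope sketch. It assumes one read capacity unit covers a strongly consistent read of up to 4 KB and that an eventually consistent read costs half of that; the function name is my own, and the current DynamoDB docs are the authority on the exact figures.

```python
import math


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool) -> float:
    """Estimate capacity units consumed by one read (assumes 4 KB per unit)."""
    if item_size_bytes <= 0:
        raise ValueError("item size must be positive")
    blocks = math.ceil(item_size_bytes / 4096)  # round up to whole 4 KB blocks
    return blocks * (1.0 if strongly_consistent else 0.5)


for size in (1_000, 5_000):
    print(
        size,
        read_capacity_units(size, strongly_consistent=True),
        read_capacity_units(size, strongly_consistent=False),
    )
```

Under those assumptions, the consistent read is always exactly twice the cost of the eventually consistent one.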