by jameshart 3511 days ago
Do all these services provide equivalent or abstractable guarantees? Amazon S3, for example, provides 'read-after-write' consistency, meaning once you've received a positive response to a put operation, you can expect to be immediately able to retrieve that object. But it used to be 'eventually consistent', meaning it was possible to receive a positive response to a put, but then not be able to read the object immediately after.

Similar guarantees are needed around how soon after a delete readers can expect to get 404s.

If these guarantees differ, you might find abstracting over the stores doesn't work the way you'd like...
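
To make the question concrete, here is a rough sketch of a probe that counts how many reads it takes before a PUT (or a DELETE) becomes visible. `LaggyStore` is a made-up in-memory stand-in, not any provider's API, and its `lag` parameter is purely an assumption for illustration. A probe like this can only disprove read-after-write on a given run; a pass does not prove the guarantee holds.

```python
from typing import Dict, List, Optional, Tuple


class LaggyStore:
    """Toy in-memory store: a write or delete is hidden from the next `lag` operations."""

    def __init__(self, lag: int) -> None:
        self.lag = lag
        self.clock = 0  # logical clock; ticks once per operation
        self.visible: Dict[str, bytes] = {}
        # Entries are (visible_after, key, value); value None means a delete.
        self.pending: List[Tuple[int, str, Optional[bytes]]] = []

    def _tick(self) -> None:
        self.clock += 1
        remaining = []
        for visible_after, key, value in self.pending:
            if visible_after < self.clock:
                if value is None:
                    self.visible.pop(key, None)
                else:
                    self.visible[key] = value
            else:
                remaining.append((visible_after, key, value))
        self.pending = remaining

    def put(self, key: str, value: bytes) -> None:
        self._tick()
        self.pending.append((self.clock + self.lag, key, value))

    def delete(self, key: str) -> None:
        self._tick()
        self.pending.append((self.clock + self.lag, key, None))

    def get(self, key: str) -> Optional[bytes]:
        self._tick()
        return self.visible.get(key)  # None plays the role of a 404


def reads_until_visible(store, key: str, value: bytes, max_reads: int = 10) -> Optional[int]:
    """PUT, then count the stale reads before the new value shows up."""
    store.put(key, value)
    for attempt in range(max_reads):
        if store.get(key) == value:
            return attempt  # 0 means read-after-write held on this run
    return None


def reads_until_gone(store, key: str, max_reads: int = 10) -> Optional[int]:
    """DELETE, then count the stale reads before the key starts returning 'not found'."""
    store.delete(key)
    for attempt in range(max_reads):
        if store.get(key) is None:
            return attempt
    return None
```

With `LaggyStore(lag=0)` both probes return 0; with `lag=2` both return 2. An abstraction layer has to decide which of those two behaviours its callers are allowed to rely on.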


You can compare these guarantees and results from testing:

https://github.com/andrewgaul/are-we-consistent-yet

That is pretty old, considering what S3 has changed in the past two years. In particular, it talks about reading S3's us-standard region from the East and West coasts, and that has evolved to where us-standard (which is really us-east-1) is now the same as all other regions when it comes to consistency (presumably because all requests physically go to the East coast now, but I haven't tested that):

http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction....

Fair point, I will update the AWS us-standard results and include Backblaze as well.
Per that AWS page, you might want to add an asterisk that you lose read-after-write if you make a GET or HEAD to the keyname before your first PUT. So checking if the object exists will punk you down to eventual consistency.
Thanks! We'll address this in our documentation.
S3 only has read-after-write consistency for PUTs to new objects (not overwrites), and if you do a HEAD/GET before the PUT that degrades into being eventually consistent.

http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction....
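
A toy model of that caveat, in case it helps to see why "check if it exists, then PUT" is the risky pattern. This is only a guess at the shape of the behaviour (a remembered miss), not how S3 is implemented; `NegativeCachingStore` and `negative_ttl` are invented for the example.

```python
from typing import Dict, Optional


class NegativeCachingStore:
    """Toy model only: a miss on a key is remembered for `negative_ttl` later reads."""

    def __init__(self, negative_ttl: int = 2) -> None:
        self.negative_ttl = negative_ttl
        self.objects: Dict[str, bytes] = {}
        self.cached_misses: Dict[str, int] = {}  # key -> stale reads still to serve

    def put(self, key: str, value: bytes) -> None:
        # The write itself succeeds, but it does not clear a remembered miss.
        self.objects[key] = value

    def get(self, key: str) -> Optional[bytes]:
        remaining = self.cached_misses.get(key, 0)
        if remaining > 0:
            self.cached_misses[key] = remaining - 1
            return None  # stale "not found" served from the remembered miss
        if key not in self.objects:
            self.cached_misses[key] = self.negative_ttl
            return None  # genuine miss, now remembered
        return self.objects[key]


store = NegativeCachingStore(negative_ttl=2)

# PUT to a key nobody has asked about: the next read sees it.
store.put("fresh", b"1")
fresh_read = store.get("fresh")

# Existence check first, then PUT: the next reads can still miss.
store.get("checked")
store.put("checked", b"2")
checked_reads = [store.get("checked") for _ in range(3)]
```

In this model `fresh_read` is `b"1"`, while `checked_reads` is `[None, None, b"2"]`: the existence check is what turns the first reads after the PUT into misses.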

It's so bad that big companies, like Netflix, have to write a whole service to store a copy of the S3 metadata: http://techblog.netflix.com/2014/01/s3mper-consistency-in-cl...

Netflix has turned the eventually consistent S3 metadata problem into an eventually consistent s3mper metadata problem. The difference is that they can now inspect and reason about s3mper's metadata.
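
For readers who haven't seen the pattern, a minimal sketch of the general idea: keep your own record of the keys you wrote and compare each listing against it. This is not Netflix's implementation; `ListingChecker` and its methods are hypothetical names. As the comment above says, the side record has its own consistency story that you then have to manage.

```python
from typing import Iterable, Set


class ListingChecker:
    """Hypothetical side index of keys this process knows it has written."""

    def __init__(self) -> None:
        self.expected: Set[str] = set()

    def record_write(self, key: str) -> None:
        self.expected.add(key)

    def record_delete(self, key: str) -> None:
        self.expected.discard(key)

    def missing_from(self, listing: Iterable[str], prefix: str = "") -> Set[str]:
        """Return keys under `prefix` that we wrote but the listing did not return."""
        seen = set(listing)
        return {
            key
            for key in self.expected
            if key.startswith(prefix) and key not in seen
        }


checker = ListingChecker()
for key in ("logs/1", "logs/2", "tmp/x"):
    checker.record_write(key)

# Pretend the object store's listing has not caught up with "logs/2" yet.
stale_listing = ["logs/1"]
missing = checker.missing_from(stale_listing, prefix="logs/")
```

Here `missing` is `{"logs/2"}`, which is the signal to wait and list again rather than process an incomplete set of files.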

Spotify has had to do the same thing for Google Cloud Engine. I can't help but wonder if eventually consistent object stores will go the way of NoSQL databases when a consistent, scalable hierarchical filesystem appears.

GCS is a bit better, since individual objects are strongly consistent, but I've been bitten by the eventually consistent listing.
Also on GCS, if you do a HEAD after a DELETE in a bucket that is under lifecycle management, it returns 200 instead of 404. Not really a consistency issue, but it can really come back to bite you if you're not aware of it. GET returns 404 but HEAD returns 200.

I reported it as a bug but Google said it was by design. More specifically they said: "You are correct, if the versioning enabled in your bucket then the object metadata is saved as an archive object in the bucket [1].This is the reason you are getting 200 for your HEAD request."

>I can't help but wonder if eventually consistent object stores will go the way of NoSQL databases when a consistent, scalable hierarchical filesystem appears.

Of course they will. Eventual consistency is a huge tradeoff that I don't think anyone would make if they weren't forced to.

Then you trade availability for consistency. No system can escape CAP.
Exactly. Availability and partition tolerance are often hard requirements, so we have to do all sorts of gymnastics to deal with eventual consistency. In personal projects where availability isn't a big concern and my most precious resource is my own time I tend to make different tradeoffs.
Maybe something like this - 1 million ops/sec on HDFS: http://www.logicalclocks.com/index.php/2016/10/14/hops-smash...
Very good question. Of course, all services behave a little differently, which is out of our control. To be honest, I don't have an exact answer on that. In my first tests I could chain a download directly after the upload with all services. But I don't know if that's because they're all guaranteed to be read-after-write or because some are eventually consistent and just really fast. In general, if one service is read-after-write and you want to swap it for one that isn't, you might run into problems unless you've programmed defensively (checking for existence before proceeding). Give us some time to check that and run some more tests.
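
One way "programming defensively" could look, as a sketch rather than a recommendation: poll an existence check with exponential backoff before proceeding. `wait_until_exists` is a hypothetical helper, and `exists` stands for whatever check your store's client offers. Note the caveat raised elsewhere in this thread: on some stores the existence check itself can affect what later reads see.

```python
import time
from typing import Callable


def wait_until_exists(
    exists: Callable[[], bool],
    attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `exists` up to `attempts` times, doubling the delay between tries.

    Returns True as soon as the object is visible, False if it never shows up.
    `sleep` is injectable so the helper can be tested without real waiting.
    """
    for attempt in range(attempts):
        if exists():
            return True
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

The caller still has to decide what a `False` means: a failed upload, or a store that is simply slower than the retry budget.
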
That's kind of the point though. This abstraction seems doomed to be extremely leaky. Honestly, I'd rather just deal with them separately than be constantly fighting an API.