by rossmohax 1884 days ago
Recent S3 consistency improvements are welcome, but S3 still falls behind Google GCS until they support conditional PUTs.

GCS allows an object to be replaced conditionally with the `x-goog-if-generation-match` header, which can be quite useful.
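To make the semantics concrete, here is a minimal in-memory sketch of a generation-matched write. It is not a GCS client: `ToyBucket`, `PreconditionFailed`, and the small counter used for generations are hypothetical stand-ins (real generations are large opaque numbers). The only behaviour borrowed from GCS is that a write succeeds only when the supplied generation equals the object's current one, with `0` meaning "the object must not exist yet".

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


@dataclass
class ToyBucket:
    """In-memory stand-in for a bucket (hypothetical, not a real GCS API)."""
    _objects: Dict[str, Tuple[int, bytes]] = field(default_factory=dict)
    _next_generation: int = 1

    def get(self, name: str) -> Optional[Tuple[int, bytes]]:
        """Return (generation, data), or None if the object is absent."""
        return self._objects.get(name)

    def put(self, name: str, data: bytes,
            if_generation_match: Optional[int] = None) -> int:
        """Write the object; enforce the precondition when one is given."""
        current = self._objects.get(name)
        current_generation = current[0] if current else 0  # 0 == absent
        if (if_generation_match is not None
                and if_generation_match != current_generation):
            raise PreconditionFailed(
                f"expected generation {if_generation_match}, "
                f"found {current_generation}")
        generation = self._next_generation
        self._next_generation += 1
        self._objects[name] = (generation, data)
        return generation


bucket = ToyBucket()
gen1 = bucket.put("config", b"v1", if_generation_match=0)     # create if absent
gen2 = bucket.put("config", b"v2", if_generation_match=gen1)  # replace v1 only
try:
    bucket.put("config", b"stale", if_generation_match=gen1)  # v1 is gone
    stale_write_succeeded = True
except PreconditionFailed:
    stale_write_succeeded = False
```

The third write is rejected because it was prepared against a generation that has since been replaced, which is the lost-update protection the header provides.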

Vogels spoke briefly about why AWS prefers versioned objects instead here: https://queue.acm.org/detail.cfm?id=3434573

BTW, DynamoDB supports conditional PUTs if your data can fit under 400 KiB.

How do versioned objects make conditional puts unnecessary? I see little relation between them, except that you could use the version identifier in the condition.
Because they let AWS offload the hard part to you, which is what AWS does best :)
This doesn't answer the question being asked.
Why would AWS provide a feature that makes additional transaction and storage charges (as well as subsequent reads to see which is the correct version) irrelevant?
S3 (and most AWS services) are extremely price elastic; i.e., the lower you make the cost, the more people use them (à la electricity). That's why they've done stuff like drop from 50ms billing to 1ms billing, etc.
They could still offer it as a client library feature and just tell users what kind of r/w amplification and guarantees they can expect; it's something they could optimize later, or not.
You can't reliably implement conditional PUTs in the client; you need a server-side mechanism like the `if-match` header.
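A small deterministic sketch of why the check has to live on the server. Both stores are hypothetical in-memory toys, not any AWS API: `PlainStore` offers only an unconditional PUT, while `ConditionalStore.put_if_match` performs the compare and the write as one atomic step. The interleaving of clients A and B is written out by hand so the result does not depend on thread timing.

```python
import threading


class PlainStore:
    """Toy store with GET and unconditional PUT only (hypothetical)."""

    def __init__(self) -> None:
        self.value = 0

    def get(self) -> int:
        return self.value

    def put(self, new_value: int) -> None:
        self.value = new_value


class ConditionalStore(PlainStore):
    """Toy store whose compare-and-write happens atomically on the 'server'."""

    def __init__(self) -> None:
        super().__init__()
        self._lock = threading.Lock()

    def put_if_match(self, expected: int, new_value: int) -> bool:
        with self._lock:  # the check and the write cannot be separated
            if self.value != expected:
                return False
            self.value = new_value
            return True


# Client-side check: A and B each try to increment the value once.
plain = PlainStore()
a_seen = plain.get()
b_seen = plain.get()
a_check_ok = plain.get() == a_seen  # A re-reads: nothing changed, looks safe
b_check_ok = plain.get() == b_seen  # B re-reads before A's write lands
if a_check_ok:
    plain.put(a_seen + 1)
if b_check_ok:
    plain.put(b_seen + 1)           # overwrites A's increment

# Server-side check: same interleaving, but the store rejects the stale write.
cond = ConditionalStore()
a_seen = cond.get()
b_seen = cond.get()
a_first_try = cond.put_if_match(a_seen, a_seen + 1)
b_first_try = cond.put_if_match(b_seen, b_seen + 1)  # stale, rejected
if not b_first_try:
    b_seen = cond.get()                               # B re-reads and retries
    b_retry = cond.put_if_match(b_seen, b_seen + 1)
```

In the plain store two increments leave the value at 1; in the conditional store B learns that it lost the race and can retry, so both increments survive.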
Conditional PUTs aren't possible with versioned objects, but if you desire immutability, then they do help.

To implement serialisation (as opposed to using conditional PUTs), one could implement a ledger on top of versioned buckets with LegalHolds. Basically, the object versions that are part of the main chain are placed under LegalHold, whilst other versions are reconciled (rebased) onto main and later deleted. Tricky to implement, for sure, compared to, say, maintaining a journal in DynamoDB to track PUTs and maintain serial integrity.

There is a conditional CopyObject though (x-amz-copy-source-if...)

It can cover some of the use cases.

Can you explain how this is useful? It seems like the destination is the important thing here not the source.
It can be useful for some coordination problems.

For example, imagine that you have N writers and you want only the first of them to write something to an object.

Each writer writes its content into "_foo" and then tries to copy that to "foo" with "x-amz-copy-source-if-match". Only one of them will succeed, and "foo" will have one consistent value, so all observers (including the other writers) can agree on who has won.
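A rough in-memory sketch of that scheme, under stated assumptions: `ToyS3`, its methods, and the writer names are hypothetical, and an MD5 hex digest stands in for the ETag. It walks through one interleaving only, where every writer has written "_foo" before anyone attempts the copy; it does not model or prove the behaviour for other interleavings.

```python
import hashlib
from typing import Dict, List


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class ToyS3:
    """In-memory stand-in for a bucket (hypothetical, not the boto3 API)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    @staticmethod
    def etag(body: bytes) -> str:
        # Assumption: the ETag is modelled as the MD5 hex digest of the body.
        return hashlib.md5(body).hexdigest()

    def put(self, key: str, body: bytes) -> str:
        """Unconditional PUT; returns the ETag of what was stored."""
        self.objects[key] = body
        return self.etag(body)

    def copy(self, src: str, dst: str, copy_source_if_match: str) -> None:
        """Copy src to dst only if src's current ETag equals the given one."""
        body = self.objects[src]
        if self.etag(body) != copy_source_if_match:
            raise PreconditionFailed(f"{src} changed since it was written")
        self.objects[dst] = body


s3 = ToyS3()
writers = ["w1", "w2", "w3"]

# Phase 1: every writer puts its own content into "_foo" and keeps the ETag.
etags = {w: s3.put("_foo", f"payload from {w}".encode()) for w in writers}

# Phase 2: every writer tries the conditional copy with its own ETag.
winners: List[str] = []
for w in writers:
    try:
        s3.copy("_foo", "foo", copy_source_if_match=etags[w])
        winners.append(w)
    except PreconditionFailed:
        pass  # "_foo" no longer holds this writer's content
```

In this interleaving only the writer whose content is still in "_foo" passes the precondition, so "foo" ends up with exactly one writer's payload and the others see a failure.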