Hacker News
by inopinatus 1589 days ago
There’s no delimiter. There is only the appearance of a delimiter, to appease folks who think S3 is a filesystem, and fool them into thinking they’re looking at folders.

The object name is the entire label, and every character is equally significant for storage. When listing objects, a prefix filters the list. That’s all. However, S3 also uses substrings to partition the bucket for scale. Since they’re anchored at the start, they’re also called prefixes.
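
A minimal sketch of that listing behaviour, with a plain Python list standing in for a bucket. The key names and the `list_keys` helper are made up for illustration; this is not the S3 API, only the string matching described above.

```python
def list_keys(keys, prefix=""):
    """Return the keys that start with `prefix`, in sorted order."""
    return sorted(k for k in keys if k.startswith(prefix))


# Hypothetical bucket contents: every key is one flat string.
bucket = ["logs/2020/01.txt", "logs/2020/02.txt", "logs-archive.tar", "readme.txt"]

# The prefix is only a string match; "/" gets no special treatment,
# so "logs" matches "logs-archive.tar" as well as the "logs/..." keys.
matches = list_keys(bucket, prefix="logs")
```

Under these assumptions, `prefix="logs/"` narrows the result to the two `logs/2020/` keys purely because of the extra character, not because a folder exists.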

In my view, it’s best to think of S3’s object indexing as a radix tree.

This article, as if you couldn’t guess from the content, is written from a position of scant knowledge of S3, so it’s not surprising that it misrepresents the details.

2 comments

So if I have a bunch of objects whose names are hashes like 2df6ad6ca44d06566cffde51155e82ad0947c736 that I expect to access randomly, is there any performance benefit to introducing artificial delimiters like 2d/f6/ad6ca44d06566cffde51155e82ad0947c736? I've seen this used in some places.
To AWS S3, '/' isn't a delimiter, it's a character that's part of the filename.

So for instance "/foo/bar.txt" and "/foo//bar.txt" are different files in S3, even though they'd be the same file in a filesystem.

This gets pretty fun if you want to mirror an S3 structure on-disk, because the above suddenly causes a collision.
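
A quick way to see the collision, sketched with Python's `posixpath.normpath` as a stand-in for how a POSIX filesystem collapses repeated slashes (the variable names are illustrative):

```python
import posixpath

key_a = "/foo/bar.txt"
key_b = "/foo//bar.txt"

# As object names these are two different strings, hence two objects.
distinct_as_keys = key_a != key_b

# A POSIX-style path treats "//" like "/", so both normalise to one path.
path_a = posixpath.normpath(key_a)
path_b = posixpath.normpath(key_b)
collide_on_disk = path_a == path_b
```

Any mirroring tool therefore has to pick an escaping scheme for such keys or accept that one object overwrites the other.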

No difference other than readability. And Amazon may partition your objects on another prefix anyway, like "2d/f6/ad6c"
I don't know what impact that partitioning pattern has on S3, but it has some obvious benefits if your app needs to fall back to writing to a normal filesystem instead (like for testing).
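
For reference, a small sketch of how such names are usually derived. The `shard_key` helper and its `widths` parameter are hypothetical; the 2/2 split simply mirrors the example in the question.

```python
def shard_key(name, widths=(2, 2)):
    """Split the leading characters of `name` into "/"-separated chunks."""
    parts = []
    pos = 0
    for width in widths:
        parts.append(name[pos:pos + width])
        pos += width
    parts.append(name[pos:])  # whatever is left stays in one piece
    return "/".join(parts)


digest = "2df6ad6ca44d06566cffde51155e82ad0947c736"
sharded = shard_key(digest)
```

On a local filesystem this keeps any single directory from holding every object; on S3 the result is still just one longer string.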
>There’s no delimiter.

What's the delimiter parameter for then?

https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObje...

To provide a consistent API response as part of the ListObjects call. It has nothing to do with the storage on disk.
To help you fool yourself. It affects how object list results are presented in the API response.
"To help you fool yourself" seems like a euphemism for "to fool you". It's gotta be tough to go from "scant knowledge of S3" to genuine knowledge if the documentation is doing this to you.

If the docs are misrepresenting the details, who can blame the author of the post?

The documentation is very clear on the purpose of the delimiter parameter.

The OP does not read the docs, makes bad assumptions repeatedly throughout, and then reaps the consequences.

They can’t present a directory abstraction for list operations without a delimiter. E.g. CommonPrefixes.
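
A rough simulation of what the delimiter does to a listing, assuming the rollup behaviour described in the ListObjects docs: keys that contain the delimiter after the prefix are collapsed into one CommonPrefixes entry. The `list_objects` function below is a local sketch over a Python list, not a real client call, and it ignores pagination.

```python
def list_objects(keys, prefix="", delimiter=""):
    """Return (contents, common_prefixes) for a flat collection of keys."""
    contents = []
    common_prefixes = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut == -1:
            # No delimiter after the prefix: listed as an ordinary object.
            contents.append(key)
        else:
            # Roll everything up to and including the delimiter into one entry.
            rolled = prefix + rest[:cut + len(delimiter)]
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
    return contents, common_prefixes


# Hypothetical bucket contents.
bucket = ["a/1.txt", "a/2.txt", "b/c/3.txt", "top.txt"]

flat, _ = list_objects(bucket)
contents, prefixes = list_objects(bucket, delimiter="/")
```

The stored keys are identical in both calls; only the shape of the response changes, which is the point both sides of this thread are making.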