by xyzzy_plugh
1589 days ago
The prefix isn't delimited; its length is arbitrary and based on access patterns. A fictitious example that is close to reality: in parallel, you write a million objects each to:
  tomato/red/...
  tomato/green/...
  tomatoes/colors/...
The shortest prefixes that evenly divide writes are thus:
  tomato/r
  tomato/g
  tomatoes
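To make "evenly divide writes" concrete, here is a rough sketch that counts how many keys land under each candidate prefix. It only illustrates the idea and is not how S3 actually partitions; `load_per_prefix` is a made-up helper, and the workload is scaled down to 1,000 objects per location.

```python
def load_per_prefix(keys, prefixes):
    """Count keys per prefix, assigning each key to its longest matching prefix."""
    # Try longer prefixes first so a more specific prefix wins over a shorter one.
    ordered = sorted(prefixes, key=len, reverse=True)
    load = {p: 0 for p in prefixes}
    for key in keys:
        for p in ordered:
            if key.startswith(p):
                load[p] += 1
                break
    return load

# Hypothetical workload: 1,000 objects per location (scaled down from a million).
N = 1000
keys = (
    [f"tomato/red/{i}" for i in range(N)]
    + [f"tomato/green/{i}" for i in range(N)]
    + [f"tomatoes/colors/{i}" for i in range(N)]
)

# The three prefixes from the example split the writes into equal thirds.
even = load_per_prefix(keys, ["tomato/r", "tomato/g", "tomatoes"])
print(even)

# A shorter prefix such as "tomato" would lump every write into one bucket.
lumped = load_per_prefix(keys, ["tomato"])
print(lumped)
```

With the three prefixes above each bucket gets the same count, whereas any prefix short enough to match all three locations takes the whole load.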
If you had an existing access pattern of writing evenly to:
  tomatoes/colors/...
  bananas/...
the shortest prefixes would be:
  t
  b
So suddenly writing 3 million objects that begin with a t would cause an uneven load, or hotspot, on the backing shards. The system detects your new access pattern, determines new prefixes, and moves data around to accommodate what it thinks your needs are.

--

The delimiter is just a wildcard option. The system is essentially just a key-value store. Specifying a delimiter tells the system to transform a delimiter at the end of a list query like
  my/path/
into a pattern match like
  my/path/[^/]+/?
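As a sketch of that transformation, the snippet below runs the same kind of pattern over an in-memory list of keys. It is only an illustration of the "flat key-value store plus wildcard" idea, not S3's implementation; `list_keys` and the sample keys are hypothetical, and it assumes a single-character delimiter.

```python
import re

def list_keys(keys, prefix, delimiter="/"):
    """Return the distinct entries one level below `prefix` in a flat key list."""
    d = re.escape(delimiter)
    # my/path/ becomes ^my/path/[^/]+/? : one path segment, optionally
    # followed by the delimiter.
    pattern = re.compile(f"^{re.escape(prefix)}[^{d}]+{d}?")
    matches = []
    for key in sorted(keys):
        m = pattern.match(key)
        # Keys that share the same first segment collapse into one entry.
        if m and m.group(0) not in matches:
            matches.append(m.group(0))
    return matches

# Hypothetical flat keys; there are no real directories, only strings.
keys = [
    "my/path/a.txt",
    "my/path/sub/b.txt",
    "my/path/sub/c.txt",
    "my/other/d.txt",
]

print(list_keys(keys, "my/path/"))
```

Here the two keys under my/path/sub/ collapse into a single my/path/sub/ entry, which is what makes a flat keyspace look like a directory listing.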
|
|
I find the S3 documentation and API really confusing about this. For example, when listing objects, you get to specify a "prefix". But this does not seem to be directly related to the automatically determined prefix length based on your access patterns. And [1] says things like "There are no limits to the number of prefixes in a bucket.", which makes no sense to me given that the prefix length is something that S3 decides under the hood for you. How do you even know how many prefixes your bucket has?
[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...