
by xyzzy_plugh 1589 days ago

The prefix isn't delimited, it's an arbitrary length based on access patterns.

A fictitious example which is close to reality:

In parallel, you write a million objects each to:

   tomato/red/...
   tomato/green/...
   tomatoes/colors/...

The shortest prefixes that evenly divide writes are thus

   tomato/r
   tomato/g
   tomatoes

If you had an existing access pattern of evenly writing to

   tomatoes/colors/...
   bananas/...

The shortest prefixes would be

   t
   b

So suddenly writing 3 million objects that begin with a t would cause an uneven load or hotspot on the backing shards. The system realizes your new access pattern and determines new prefixes and moves data around to accommodate what it thinks your needs are.
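The prefix choice in the two examples above can be sketched as a toy function. This assumes the only goal is to tell the listed key roots apart character by character; it ignores request rates and is not S3's actual algorithm. The name `shortest_distinguishing_prefixes` is made up for illustration.

```python
from os.path import commonprefix  # compares any strings character by character


def shortest_distinguishing_prefixes(keys):
    """For each key, return the shortest prefix that no other key shares.

    Toy model only: every key is treated as carrying equal load.
    If one key is a full prefix of another, the key itself is returned.
    """
    result = {}
    for key in keys:
        others = [k for k in keys if k != key]
        # Longest run of leading characters shared with any other key.
        shared = max((len(commonprefix([key, o])) for o in others), default=0)
        # One character past the longest shared run separates this key.
        result[key] = key[: shared + 1]
    return result
```

For `tomatoes/colors/` and `bananas/` this gives `t` and `b`. For the three tomato keys it gives `tomato/r`, `tomato/g`, and `tomatoe`; a strict character-wise split stops one character short of the `tomatoes` written above, which makes no difference to the argument.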

The delimiter is just a wildcard option. The system is just a key-value store, essentially. Specifying a delimiter tells the system to transform delimiters at the end of a list query like

   my/path/

into a pattern match like

   my/path/[^/]+/?
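A minimal sketch of that transformation over an in-memory list of keys, assuming a single-character delimiter. The function `list_one_level` and the sample keys are hypothetical; this only imitates the pattern match described above and does not call any real S3 API.

```python
import re


def list_one_level(keys, prefix, delimiter="/"):
    """Return the distinct matches of prefix + [^delimiter]+ + delimiter?.

    Keys directly under the prefix come back whole; deeper keys collapse
    to their first path segment plus the trailing delimiter.
    """
    d = re.escape(delimiter)
    pattern = re.compile(re.escape(prefix) + f"[^{d}]+{d}?")
    seen = []
    for key in sorted(keys):
        m = pattern.match(key)  # anchored at the start of the key
        if m and m.group(0) not in seen:
            seen.append(m.group(0))
    return seen
```

With keys `my/path/a.txt`, `my/path/sub/b.txt`, `my/path/sub/c.txt`, and `my/other/d.txt`, listing `my/path/` yields `my/path/a.txt` and `my/path/sub/`.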


stepchowfun 1589 days ago

Thank you! This is the first explanation that I think fully explains what I was confused about. So essentially the prefix is just the first N bytes of the object's name, where N is a per-bucket number that S3 automatically decides and adjusts for you. And it has nothing to do with delimiters.

I find the S3 documentation and API to be really confusing about this. For example, when listing objects, you get to specify a "prefix". But this seems to be not directly related to the automatically-determined prefix length based on your access patterns. And [1] says things like "There are no limits to the number of prefixes in a bucket.", which makes no sense to me given that the prefix length is something that S3 decides under the hood for you. Like, how do you even know how many prefixes your bucket has?

[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimi...


xyzzy_plugh 1589 days ago

The sharding key is an implementation detail, so you're not supposed to care about it too much.


kristjansson 1589 days ago

That's true now. Used to be the case that they'd recommend random or high-entropy parts of the keys go at the beginning to avoid overloading a shard as you described above.

From [0]:

> This S3 request rate performance increase removes any previous guidance to randomize object prefixes to achieve faster performance. That means you can now use logical or sequential naming patterns in S3 object naming without any performance implications. This improvement is now available in all AWS Regions. For more information, visit the Amazon S3 Developer Guide.

[0]: https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3...


xyzzy_plugh 1589 days ago

Indeed, and unfortunately my mind will forever work this way.


inopinatus 1589 days ago

It is related, in the sense both “prefixes” are a substring match anchored at the start of the object name. They’re just not the same mechanism.


chrisjc 1588 days ago

> So suddenly writing 3 million objects that begin with a t would cause an uneven load or hotspot on the backing shards.

makes sense

> The system realizes your new access pattern and determines new prefixes and moves data around to accommodate what it thinks your needs are.

What does "determines new prefixes" mean? Obviously AWS isn't going to come up with new prefixes and change object names.

So does AWS maintain prefix-surrogates (prefix sub-string(0,?) references), and are those what actually get shuffled around to handle the new unbalanced workload? Sort of like resharding?

Moreover, since it's really prefix-surrogates being used, the recommendation of randomizing prefixes can be replaced with randomizing prefix-surrogates and delegated to AWS, removing the prior responsibility from the customer. Hence the 2018 announcement https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3...
