Hacker News
by ffsm8 271 days ago
That's like saying "how to get to the moon is obvious: traveling"

Thank you for setting me up for this...

It's not exactly rocket science.

Haha, good one!

I still feel like you're underselling the article however.

It is obviously, ultimately, parallelism, but parallelism is hard at scale, because things often don't scale, and incorrect parallelism can even make things slower. And it's not always obvious why parallelizing something makes it slower.

As a dumb example, if you have a fictional HDD with one disk and one head, you have two straightforward options to optimize performance:

Make sure only one file is read at a time (otherwise the head will keep seeking back and forth)

Make sure the file is persisted in a way that you're only accessing one sector, never entering the situation in which it would seek back and forth.
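
To put rough numbers on that toy HDD, here is a minimal sketch. The block positions and file sizes are made up for illustration, and the only cost modeled is how far the single head has to travel. It compares reading two files one after the other against reading them interleaved:

```python
def head_travel(requests, start=0):
    """Total distance a single head moves to serve block positions in order."""
    pos, total = start, 0
    for block in requests:
        total += abs(block - pos)
        pos = block
    return total

# Two hypothetical files, each 100 contiguous blocks, far apart on the platter.
file_a = list(range(0, 100))
file_b = list(range(10_000, 10_100))

# Option 1: read one file at a time.
one_at_a_time = file_a + file_b

# Naive "parallel" reads: alternate between the two files block by block.
interleaved = [blk for pair in zip(file_a, file_b) for blk in pair]

sequential_cost = head_travel(one_at_a_time)
interleaved_cost = head_travel(interleaved)

print(f"one file at a time: {sequential_cost} blocks of head travel")
print(f"interleaved:        {interleaved_cost} blocks of head travel")
```

In this model the interleaved schedule spends almost all of its time seeking, which is the "parallelism made it slower" case.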

Ofc, that can be dumbed down to "parallelism", because this is inherently a question about how to parallelize... But that ignores what the article is actually elaborating on: the ways S3 used to enable parallelism.

I dunno, the article's tl;dr is just parallelism.

Data gets split into redundant copies, and is rebalanced in response to hot spots.

Everything in this article is the obvious answer you'd expect.

You're right if you're only looking at peak sequential throughput. However, and this is the part that the author could have emphasized more, the impressive part is their strategy for dealing with disk access latency to improve random read throughput.

They shard the data as you might expect of a RAID 5, 6, etc. array, and the distributed parity solves the problem of failure tolerance as you would expect and also improves bandwidth via parallelism as you describe.

The interesting part is their best strategy for sharding the data: plain-old-simple random. The decision of which disks and which sectors hold each shard is made at random, and this creates the best chance that at least one of the two copies of data can be accessed with much lower latency (~1ms instead of ~8ms).
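
A quick way to see why a second randomly placed copy helps is a toy simulation. This is a sketch under a loud assumption: each copy's access latency is drawn independently and uniformly from 0 to 8 ms, which is my simplification and not S3's actual placement or latency model. The reader always takes whichever copy is fastest:

```python
import random

def mean_latency_ms(copies, trials=100_000, max_latency_ms=8.0, seed=0):
    """Average latency when the fastest of `copies` random placements is used.

    Assumption: each copy's latency is independent and uniform on
    [0, max_latency_ms]. Real disks are not this simple.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.uniform(0.0, max_latency_ms) for _ in range(copies))
    return total / trials

one_copy = mean_latency_ms(copies=1)
two_copies = mean_latency_ms(copies=2)

print(f"1 copy:   {one_copy:.2f} ms on average")
print(f"2 copies: {two_copies:.2f} ms on average")
```

Under this model the expected minimum of k copies is 8/(k+1) ms, so each extra independent placement pulls the average down without any clever scheduling.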

The most crude, simple approach turns out to give them the best mileage. There's something vaguely poetic about it, an aesthetic beauty reminiscent of Euler's Identity or the solution to the Basel Problem; a very simple statement with powerful implications.

It's not really "redundant copies". It's erasure coding (i.e., your data is the solution of an overdetermined system of equations).
That’s just fractional redundant copies.
And "fractional redundant copies" is way less obvious.
The fractional part isn't helping them serve data any faster. To the contrary, it actually reduces the speed from parallelism. E.g. a 5:9 scheme only achieves 1.8x throughput, whereas straight-up triple redundancy would achieve 3x.

It just saves AWS money is all, by achieving greater redundancy with less disk usage.
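
The arithmetic behind those numbers, as a small sketch. The `scheme` helper is hypothetical, and it uses the simple model from the comment above: a read has to touch `data_shards` out of `total_shards` disks, so both the storage overhead and the read multiplier come out as total/data:

```python
from fractions import Fraction

def scheme(data_shards, total_shards):
    """Compare redundancy schemes where any `data_shards` of `total_shards` suffice."""
    ratio = Fraction(total_shards, data_shards)
    return {
        "storage_overhead": ratio,   # bytes stored per byte of user data
        "read_parallelism": ratio,   # throughput multiplier in this simple model
        "losses_tolerated": total_shards - data_shards,
    }

erasure_5_of_9 = scheme(5, 9)
triple_replication = scheme(1, 3)  # plain copies are the 1-of-3 special case

for name, s in [("5:9 erasure coding", erasure_5_of_9),
                ("3x replication", triple_replication)]:
    print(f"{name}: {float(s['storage_overhead']):.1f}x storage, "
          f"{float(s['read_parallelism']):.1f}x reads, "
          f"survives {s['losses_tolerated']} lost shards")
```

So in this model 5:9 stores 1.8x the data and survives 4 losses, while 3x replication stores 3x and survives 2: less disk for more redundancy, at the price of a lower read multiplier.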