by version_five 1040 days ago

  Amazon Prime Day event resulted in an incremental 163 petabytes of EBS storage capacity allocated – generating a peak of 15.35 trillion requests and 764 petabytes of data transfer per day. 
The main thing that strikes me is how (seemingly) inefficient everything is. What could they possibly need this amount of data for in selling stuff? Are they taking high-def video of every customer as they browse for something to buy? I get that it's a huge company and this is (I guess) their busiest time, but how can they need so much storage? Ditto for much of the other stuff.
8 comments

Yeah, those numbers struck me as well. At 375 million items sold, that's about 0.5GB storage and 2GB transfer per item.
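
For anyone who wants to check the per-item figures, here is a quick back-of-the-envelope sketch in Python. It assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and takes the 163 PB, 764 PB and 375 million numbers at face value from the quote and the comment above; the variable names are just for illustration.

```python
# Back-of-the-envelope check of the per-item numbers.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PB = 10**15
GB = 10**9

items_sold = 375_000_000          # figure quoted in the comment
ebs_allocated_bytes = 163 * PB    # incremental EBS capacity allocated
ebs_transfer_bytes = 764 * PB     # peak data transfer per day

storage_per_item_gb = ebs_allocated_bytes / items_sold / GB
transfer_per_item_gb = ebs_transfer_bytes / items_sold / GB

print(f"storage per item:  {storage_per_item_gb:.2f} GB")   # 0.43 GB
print(f"transfer per item: {transfer_per_item_gb:.2f} GB")  # 2.04 GB
```

So "about 0.5GB and 2GB" holds up as a rounding of roughly 0.43 GB and 2.04 GB.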
10+ years ago I worked on a trading system that was generating something like 1TB/day of messaging.

As we hit these levels we asked them - how many trades are we even doing on this system? The answer was something on the order of.. 50. Granted it was a bond system and the notionals are huge, but there's just no reason to store 20GB per trade.

These are the kinds of decisions that get made when one team is responsible for message generation and the other is responsible for the storage, lol.

We then had to work backwards with them to unwind a lot of the INFO level chatty messaging between what you'd now call "microservices" and reduce the volume by 90+%.

I suppose you need to know how many requests did not result in a purchase. Is it 1000 views:purchase? I have not checked in on a Prime Day sale for several years, but is there any timeliness component (Flash Sales?) where people would be incentivized to mash the reload button?
Yes, but that's per item sold.

After looking at screen after screen of no-name garbage on Prime Day, I gave up. I suspect that there are tons of people like me. In other words, we only contributed to the numerator, not the denominator.

I think the EBS numbers are "double counting". Most of the other services in the list use EBS under the hood, so I wouldn't be surprised if this number includes stuff like the Aurora instances, CloudTrail events, SQS events, etc. that are also counted separately.

Also, it specifically says "incremental capacity allocated", not necessarily used. Keep in mind that every EC2 instance launched also means new EBS storage is allocated. The article also estimates that 50 million EC2 instances were used for Prime Day. If you assume that half of these were newly created to support the surge of Prime Day, 25 million instances using up 160 PB of storage is only 6 gigabytes per instance, which definitely seems in the realm of possibility.
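
The per-instance estimate is easy to reproduce, and to rerun with a different guess for the share of newly launched instances. This is only a sketch: the 50% share is the comment's own assumption, the 160 PB is the rounded figure used above, decimal units are assumed, and `gb_per_new_instance` is a made-up helper name.

```python
# Rough per-instance allocation under the comment's assumptions.
# Assumption: decimal units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PB = 10**15
GB = 10**9

TOTAL_INSTANCES = 50_000_000   # article's estimate of EC2 instances used
ALLOCATED_BYTES = 160 * PB     # rounded incremental EBS allocation


def gb_per_new_instance(new_fraction: float) -> float:
    """Average GB allocated per instance if `new_fraction` of the
    instances were newly launched for the event (hypothetical helper)."""
    new_instances = TOTAL_INSTANCES * new_fraction
    return ALLOCATED_BYTES / new_instances / GB


print(f"{gb_per_new_instance(0.5):.1f} GB per new instance")  # 6.4 GB
```

Even if only a quarter of the instances were new, the same call with `0.25` gives about 12.8 GB each, which is still in the range of an ordinary root volume.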

It seems to me that a lot of modern architectures store the same data in multiple places. The systems I see proposed in my company probably often need 10 times more space than the actual data we have because they copy and cache a lot of stuff.
Microservices require denormalizing data across tables and dbs. There’s a cost to how many microservices you build.
Hot take: Amazon's search UX is so terrible that it not only wastes near-endless amounts of customer time and patience, but their own bandwidth as well.
They’ve a/b tested it to death
I wonder if Amazon has overfitted and/or a/b tested itself into a bad local optimum. It’s pretty hard for me to believe that their current website really is as good as their data indicates.
IME a/b tests are often run by people with little to no knowledge of statistics or experimental procedure. It is pretty easy to end up backing bad decisions with data when you don't completely understand the data.
"Hey, check this out! User engagement as a function of time spent on amazon.com is up 125% with the new build!"

Once a metric becomes a target for optimization, it often loses its value as an indicator of a larger goal. People who obsess over A/B tests rarely understand that.

A lot of that was certainly just for the root volumes of all those ec2 instances (how much exactly is hard to know without more details). Which of course would have duplicate copies of the various base images for the VMs.

Although, that does bring up the question of why AWS doesn't have a way to share a single read-only volume across multiple ec2 instances in the same availability zone. In many workloads there isn't any need to write to disk.

There kind of is, but it's not really made for that use case so there's a bunch of caveats (it's read/write, has a limited max number of attachments, io1/2 required, can't be the boot volume): https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volu...
Sometimes it’s just a bad decision that happens to “scale”. Like the print video thing.[1]

1. https://youtu.be/J7ITgYBn_3k

The EBS storage could easily be highly redundant (for good reason) local cache copies of store data.
Logging, metrics, distributed action trace.