by scaleout1 3685 days ago
When you suddenly realize that your "big" data is not really that big! Who needs a Hadoop/Spark cluster when you can run one of these bad boys?
2 comments

That was kind of my thought as well... I worked on a small-to-mid-sized classifieds site (about 10-12 unique visitors a month on average) and even then the core dataset was about 8-10GB, with some log-like data hitting around 4-5GB/month. This is freakishly huge. I don't know enough about different platforms to even digest how well you could utilize that much memory. Though it would be a first to genuinely have way more hardware than you'll likely ever need for something.

IIRC, the images for the site were closer to 7-8TB, but I don't know how typical that is for other types of sites, and caching every image on the site in memory is pretty impractical... just the same... damn.

I think you're missing a unit. 10-12 thousand? million?
million... lol
Heh, but I wonder what the default per account limits are on launching these... prolly (1) per account.
Why would they put any kind of a limit on it?
All AWS accounts have "limits", which set the default cap on how many instances you can launch in a given region.

The reason is so that if you fuck up a scaling script, for example, you can't launch 1000 machines, take all the capacity, and then bitch that you won't pay for it.

It's a stopgap.

However, aside from the hard limit of 100 S3 buckets, all other limits are configurable at the request of your AWS rep.

It looks like that hard limit became a soft limit in August:

https://aws.amazon.com/about-aws/whats-new/2015/08/amazon-s3...

Hmm... I didn't know this. The last time I asked they said it would never happen. I was told the original reason was that all buckets had to have a unique name.
I thought that too but I successfully raised our limit after they made it a soft limit, so give it a shot!
because they can only put these into racks so fast
And to prevent a runaway script from suddenly spinning up thirty of them. Besides issues with their hardware capacity, they're generally pretty good about refunding mistakes like that, so they're eating the cost...
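
For what it's worth, you can put the same kind of stopgap into your own scaling script instead of relying only on the account limit. A rough sketch, not anything AWS ships: `CappedLauncher`, `LaunchCapExceeded` and `fake_launch` are made-up names for illustration, and `launch_fn` stands in for whatever call actually starts your instances.

```python
from itertools import count
from typing import Callable, List


class LaunchCapExceeded(RuntimeError):
    """Raised when a request would push the fleet past the local cap."""


class CappedLauncher:
    """Wraps a launch function and refuses to exceed a fixed instance cap.

    Hypothetical helper: launch_fn is assumed to take a number of
    instances and return one ID string per instance it started.
    """

    def __init__(self, launch_fn: Callable[[int], List[str]], max_instances: int) -> None:
        if max_instances < 0:
            raise ValueError("max_instances must be non-negative")
        self._launch_fn = launch_fn
        self._max = max_instances
        self._ids: List[str] = []

    @property
    def running(self) -> int:
        """Number of instances launched through this wrapper."""
        return len(self._ids)

    def launch(self, n: int) -> List[str]:
        """Launch n instances, or raise before launching anything."""
        if n <= 0:
            raise ValueError("n must be positive")
        if self.running + n > self._max:
            raise LaunchCapExceeded(
                f"asked for {n}, already running {self.running}, cap is {self._max}"
            )
        new_ids = self._launch_fn(n)
        self._ids.extend(new_ids)
        return new_ids


# Stand-in for a real launch call so the sketch runs without any cloud account.
_fake_counter = count(1)


def fake_launch(n: int) -> List[str]:
    return [f"i-fake{next(_fake_counter):04d}" for _ in range(n)]


if __name__ == "__main__":
    launcher = CappedLauncher(fake_launch, max_instances=3)
    print(launcher.launch(2))
    try:
        launcher.launch(30)  # the runaway case
    except LaunchCapExceeded as err:
        print("refused:", err)
```

The check happens before the launch call, so a buggy loop fails loudly on the first oversized request rather than after the machines are already billing.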