HN Mirror



	by scaleout1 3685 days ago
	When you suddenly realize that your "big" data is not really that big! Who needs a Hadoop/Spark cluster when you can run one of these bad boys?

2 comments

tracker1 3685 days ago

That was kind of my thought as well... I worked on a small-to-mid-sized classifieds site (about 10-12 unique visitors a month on average), and even then the core dataset was about 8-10GB, with some log-like data growing by around 4-5GB/month. This is freakishly huge. I don't know enough about different platforms to even digest how well you could utilize that much memory. Though it would be a first to genuinely have way more hardware than you'll likely ever need for something.

IIRC, the images for the site were closer to 7-8TB, but I don't know how typical that is for other types of sites, and caching every image on the site in memory is pretty impractical... just the same... damn.


sciurus 3684 days ago

I think you're missing a unit. 10-12 thousand? million?


tracker1 3684 days ago

million... lol


samstave 3685 days ago

Heh, but I wonder what the default per account limits are on launching these... prolly (1) per account.


cwyers 3685 days ago

Why would they put any kind of a limit on it?


samstave 3685 days ago

All AWS accounts have "limits", which set the default cap on how many instances you can launch in a given region.

The reason is so if you fuck up a scaling script for example you can't launch 1000 machines and take all the capacity and then bitch that you won't pay for it.

It's a stopgap.

However, aside from the hard limit of 100 S3 buckets, all other limits can be raised on request through your AWS rep.
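To make the idea concrete, here is a minimal sketch of how a per-region instance cap stops a runaway scaling script. It is an illustration only, not how AWS implements its limits: `check_launch`, `LimitExceeded`, and the cap of 20 are hypothetical names and values picked for the example.

```python
class LimitExceeded(Exception):
    """Raised when a launch request would exceed the account's cap (hypothetical)."""


def check_launch(running: int, requested: int, limit: int) -> int:
    """Return the new running-instance count, or raise if the cap would be exceeded.

    running   -- instances already running in the region
    requested -- instances the caller wants to launch now
    limit     -- per-region cap for the account (assumed value, not a real AWS default)
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    if running + requested > limit:
        raise LimitExceeded(
            f"launching {requested} would bring the total to "
            f"{running + requested}, above the limit of {limit}"
        )
    return running + requested
```

With a cap of 20, `check_launch(0, 5, 20)` goes through, while a buggy script asking for 1000 machines gets a `LimitExceeded` error instead of a bill.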


geofft 3685 days ago

It looks like that hard limit became a soft limit in August:

https://aws.amazon.com/about-aws/whats-new/2015/08/amazon-s3...


samstave 3684 days ago

Hmm... I didn't know this. The last time I asked, they said it would never happen. I was told the original reason was that all buckets had to have a unique name.


admiun 3684 days ago

I thought that too, but I successfully raised our limit after they made it a soft limit, so give it a shot!


slaman 3685 days ago

Because they can only put these into racks so fast.


nucleardog 3684 days ago

And to prevent a runaway script from suddenly spooling up thirty of them. Besides issues with their hardware capacity, they're generally pretty good about refunding mistakes like that, so they're eating the cost...
