| HN Mirror



	by themitigating 1260 days ago
	"It's kinda a shame they reduced it to this. A single machine in a colo center is going to be far more reliable than single availability zone" I think that depends on the colo, honestly. What is so unreliable about a single EC2 instance in a zone?

4 comments

mrkurt 1260 days ago

Faster disks, no control plane to fail, simpler network, etc.

This isn't bias speaking: I work on Fly.io, and our VMs are less reliable than EC2 VMs. AWS's pitch is that all the extra complexity in their infrastructure benefits you. So is ours! But it is, in fact, extra complexity that will bite you in the ass if you don't build your apps the right way.


dilyevsky 1260 days ago

The fact that, under the hood, EC2 is a massively complex infrastructure, as opposed to most colos.


arcturus17 1260 days ago

Yeah, I’m left scratching my head too. Is there really a difference in reliability between an EC2 instance and colocated hardware?


mrkurt 1260 days ago

Yes. In my experience, it's substantial. 200 servers in colo == maybe 1 failure every 6 months. 200 EC2 instances, one per month.

These are different things, though. If you're using AWS, you would build to account for this.
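
To put those two figures on the same footing, here is a rough back-of-the-envelope sketch in Python. It assumes failures are spread evenly over time and takes the anecdotal numbers above at face value; the function name `annualized_failure_rate` is made up for illustration.

```python
# Anecdotal figures from the comment above: 200 machines in each fleet.
FLEET_SIZE = 200


def annualized_failure_rate(failures, months, fleet_size=FLEET_SIZE):
    """Failures per machine per year, given failures seen over a window of months."""
    failures_per_year = failures * 12 / months
    return failures_per_year / fleet_size


# Colo: roughly 1 failure every 6 months across the fleet.
colo_rate = annualized_failure_rate(failures=1, months=6)
# EC2: roughly 1 failure per month across the fleet.
ec2_rate = annualized_failure_rate(failures=1, months=1)

print(f"colo: {colo_rate:.1%} per server per year")
print(f"EC2:  {ec2_rate:.1%} per instance per year")
print(f"EC2 is about {ec2_rate / colo_rate:.0f}x the colo rate")
```

That works out to roughly 1% versus 6% per machine per year, so with these numbers any given box is still unlikely to fail in a year, but a fleet on EC2 sees failures often enough that you have to design for them.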


themitigating 1259 days ago

your personal experience has no value when discussing reliability as a whole


mrkurt 1256 days ago

thank you.


tptacek 1256 days ago

Your personal experience is no match for the awesome power of AXIOMS.


CuriousCosmic 1260 days ago

US-East-1 is pretty famous for being unreliable. The other regions tend to be a lot more reliable in comparison.


acdha 1260 days ago

There’s also a lot of selection bias: that region is the most popular, and people remember hearing about problems far more than they hear from the people who were unaffected and said nothing.

I’ve had plenty of instances in us-east-1 for over a decade without downtime other than the 17 minutes in 2011 where they had a network routing issue which kept the entire region running but off of the internet. I never had that with a colo - power outages & backhoes - but several came close.

For me, I’d tend to focus the question on how screwed you are if something goes down. You can save a ton of money for a bandwidth-heavy service if you use a colo so it’d really be a question of how easy it is to make it redundant (short outage) and rebuild (long outage or permanent equipment failure).


karmakaze 1260 days ago

It's well known that us-east-1 (the very first) is a pet among AWS's cattle regions.

It has failure modes that none of the other regions have.


acdha 1260 days ago

I’m aware, but my point was simply that people are prone to overstating the extent of those problems. If it was as bad as lore would have it, it’d be far less popular.


karmakaze 1260 days ago

Why would you think it would be less popular? Most everyone that chooses us-east-1 chooses it because they're close to it and 1 is the first number. They don't research it before they start using it.


acdha 1260 days ago

If people were experiencing significant downtime they’d leave us-east-1 or AWS. There’s no sign of that happening so I’d suggest that there’s a tendency to over-weight the degree to which people complaining in forums constitutes representative data.


themitigating 1260 days ago

"Well known"

That's not sufficient evidence
