by arturhoo
3917 days ago
"We did not have detailed enough monitoring for this dimension (membership size), and didn’t have enough capacity allocated to the metadata service to handle these much heavier requests." As much as I admire and rely on AWS's scale to build architectures and fault-tolerant applications, it can't be ignored that the marketing around going "full cloud" doesn't take into account how hard it is to build resilient architectures in the cloud. I see these disruption events as stop signs: when the cloud itself fails to scale, I rethink a few of the decisions we all make when surfing those trends. http://yourdatafitsinram.com/ also comes to mind.
That said, even with those disruptions on Amazon serving as a warning, I have neither the skill nor the time to build a resilient non-cloud infrastructure.
I was looking to go with redundant VPSes at first, because Amazon is costly for us. However, just learning all the things that can go wrong in the very first part, the load balancer, and all the gritty details one has to consider for just this little component to support interruption-free failover, made me rethink the cost-benefit of going managed.
It is true that going cloud doesn't remove the risk of outages completely, and it will not be as resilient as an infrastructure built with skill and love by the best out there, but how many shops can actually roll their own solution and get an equivalent level of availability?
Scaling web nodes is within my capabilities. Building an HA database is already quite above my skill, but I may manage. Testing database failover, making sure it works, making sure that it can actually recover from one node dying and that the application stays live meanwhile? That's way above what I can reasonably do and what my company can afford to pay maintenance for.