by brentonator
2285 days ago
It was a bit of both. I don't think they had the infrastructure to support everyone, I think they had major growing pains as they tried to scale, and I don't think they had the experience to proactively identify sources of partial failure. That resulted in 40-360 minute effective outages that weren't as easy to fix as on AWS, where we can shift the stack to a new AZ in minutes. DO had no cross-"AZ" networking. We tried to set up a tunnel, but it was so unreliable that they admitted they needed their own backbone to support that use case. Even when that did come out, we'd still have had to use public IPs; there was no VPC-peering type of thing. Our only option for switching "AZ"s was to take downtime, and that wasn't acceptable. We were able to increase capacity 300-500% on AWS for the same cost, so there's a lot going on in the equation for us. We needed a better ability to resize dynamically and more low-CPU, high-memory options. We needed more flexibility with disks, especially network-attached ones like EBS, but given their network issues I understand why they don't have that.