by gtsteve
3032 days ago
I'm not really a networking guy, so perhaps this is an obvious question, but why don't you have a failover configuration to send traffic over a VPN or the public Internet? I would expect the latency to increase, but things should otherwise still work. Is it a cost concern, is Direct Connect reliable enough that it's just an accepted risk, or is there some other reason?
I should note that we have both publicly and privately reachable resources in AWS. The publicly reachable resources have automatic failover built in for situations like this, but with our architecture the privately reachable resources depend solely on AWS Direct Connect. For example, our Bitbucket failure today happened because we rely on AWS Direct Connect to link the Bitbucket Cloud components that we host in our data centers with the ones that we host on AWS. Bitbucket could still connect to services in our own data centers and to the public Internet/AWS, but it could not talk to the privately reachable resources in the Atlassian infrastructure hosted on AWS.
We understand how important this is and the impact it has on our customers, and we dedicated several teams to the issue as soon as it was reported. AWS has resolved the issue, and we will look into ways to prevent and better mitigate these types of issues in the future as part of our incident review and improvement processes.
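To make the difference between the two kinds of resources concrete, here is a minimal sketch of path selection with failover. It is not our actual routing configuration: the `Path` type, the `select_path` function, and the path names are all hypothetical, and the backup VPN in the second list is exactly the kind of option the question asks about, not something we run today.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Path:
    """A hypothetical network path between our data centers and AWS."""
    name: str
    healthy: bool
    private: bool  # True if the path can reach privately addressed resources


def select_path(paths: Sequence[Path], needs_private: bool) -> Optional[Path]:
    """Return the first healthy path, in priority order, that can carry the traffic.

    Returns None when no usable path exists, which is the outage case.
    """
    for path in paths:
        if not path.healthy:
            continue
        if needs_private and not path.private:
            continue
        return path
    return None


# Assumed model of the situation described above: Direct Connect is down and
# the public Internet is the only other path.
current = [
    Path("direct-connect", healthy=False, private=True),
    Path("public-internet", healthy=True, private=False),
]

# Same situation with a hypothetical backup VPN added ahead of the public Internet.
with_vpn = [
    Path("direct-connect", healthy=False, private=True),
    Path("backup-vpn", healthy=True, private=True),
    Path("public-internet", healthy=True, private=False),
]

public_choice = select_path(current, needs_private=False)
private_choice = select_path(current, needs_private=True)
private_choice_with_vpn = select_path(with_vpn, needs_private=True)
```

In this model, public traffic falls back to `public-internet`, private traffic has no path at all when Direct Connect is down, and adding a private backup path is what would let private traffic keep flowing at higher latency.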