|
|
|
|
|
by tw04
1964 days ago
|
|
Per the article, literally nothing in their code would have solved the issue. AWS was supposed to auto-scale TGWs and didn't. >Our own serving systems scale quickly to meet these kinds of peaks in demand (and have always done so successfully after the holidays in previous years). However, our TGWs did not scale fast enough. During the incident, AWS engineers were alerted to our packet drops by their own internal monitoring, and increased our TGW capacity manually. By 10:40am PST that change had rolled out across all Availability Zones and our network returned to normal, as did our error rates and latency. |
|