by Twirrim
1002 days ago
I remember the regionalisation; that was "fun" to be on the sidelines for (I was in a newer service that was regionalised from the get-go). I don't remember who the PM was for that one, but that was when I truly came to respect the value a TPM can add. You're right that the cost of, and need for, replacing network equipment was one of the strong reasons why they didn't. Amazon used its own in-house designed and built network gear for a variety of reasons (IIRC there's a re:Invent talk about it), which is probably still the case.
Every single one of those machines had a fixed memory capacity and would need to be replaced to get enough memory to handle IPv6 routing table needs, etc. What they had wouldn't even have been enough if they had chosen to go IPv6-only (which you couldn't get to except via dual-stack IPv4/IPv6 anyway).
|
I'm not privy to the details, but I recall Amazon Infosec once issuing a mandate for a Java platform to remove an outdated encryption protocol. The change was made and rolled out with little fanfare.
A few weeks later, a large outage of Amazon Video (which used said platform) occurred on a Friday evening. Root cause? The network hardware accelerators were only set up to use that outdated protocol, which in turn meant that encryption was happening in software instead. Under load, the video hosting eventually caved.
It might be specific to the hardware used for Amazon retail, but it reinforces the point about their homegrown (and now aging) stack.