by psanford
854 days ago
There's a bunch of things in here that don't really make sense:

> The incident was caused by a third-party caching client library that was recently integrated into our system. This client library received unprecedented load conditions caused by devices coming back online all at once. As a result of increased demand, it mixed up device ID and user ID mapping and connected some data to incorrect accounts.

What? How does load on the system affect correctness?

> The outage originated from our partner AWS

What does this mean? Was there an AWS outage for a service they use, or was this just a normal loss of an instance?

It's interesting that they blame external entities for the root causes of the incident and don't take responsibility for what is ultimately on them.