by hintymad
2033 days ago
AWS frontend services are usually implemented in Java. If Kinesis' frontend is too, then it's surprising that the threads created by a frontend service would exceed the OS limit. This suggests three possibilities: 1. Kinesis did not impose a max thread count in their app, which is a gross omission; 2. There was a resource leak in their code; 3. Each of their frontend instances stored all the placement information of the backend servers, which means their frontend could not scale with backend size.
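
For illustration of point 1, here is a minimal sketch of what a capped worker pool looks like. It is written in Python rather than Java, and it is not Kinesis' code: `MAX_WORKERS`, `handle_peer`, and `contact_all_peers` are hypothetical names invented for this example. The only point is that the thread count stays bounded no matter how many peers there are.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8  # hypothetical cap; a real service would tune this value


def handle_peer(peer_id, seen_threads):
    """Stand-in for per-peer work; records which worker thread ran it."""
    seen_threads.add(threading.current_thread().name)
    time.sleep(0.001)  # simulate a small amount of I/O
    return peer_id


def contact_all_peers(peer_count, max_workers=MAX_WORKERS):
    """Process peer_count peers using at most max_workers threads.

    Returns the per-peer results and the number of distinct worker
    threads that were actually used.
    """
    seen_threads = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(
            pool.map(lambda p: handle_peer(p, seen_threads), range(peer_count))
        )
    return results, len(seen_threads)


if __name__ == "__main__":
    results, used = contact_all_peers(200)
    print(f"handled {len(results)} peers with {used} threads")
```

With a thread-per-peer design the thread count grows with fleet size; with a pool like this, adding peers only makes the queue longer.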
Assuming they have, say, 5000 frontend instances, that's 5000 file descriptors on each instance used just for this, before you even count whatever threads the application needs.
It's not surprising that they bumped into ulimits, though as part of OS provisioning you typically tune those for the workload.
More concerning is the roughly 5000 x 5000 open TCP sessions across their network needed to support this architecture. That has to be a lot of fun for any stateful firewall the traffic might cross.
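
To put rough numbers on the full-mesh estimate above, here is a small Python sketch. It assumes every instance opens one connection to every other instance (the 5000 figure is the guess from this comment, not a published number), and `mesh_load` and `fits_in_fd_limit` are hypothetical helper names. The `resource` module is Unix-only.

```python
import resource


def mesh_load(instances):
    """Connection counts for a full mesh of `instances` peers.

    Returns (outbound connections per instance,
             total directed connections,
             total unique peer pairs).
    """
    per_instance = instances - 1          # everyone except yourself
    directed = instances * per_instance   # each side dials every peer
    unique_pairs = directed // 2          # if one socket is shared per pair
    return per_instance, directed, unique_pairs


def fits_in_fd_limit(needed, soft_limit=None):
    """Check `needed` descriptors against the soft RLIMIT_NOFILE.

    If soft_limit is None, the current process limit is queried.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return needed <= soft_limit


if __name__ == "__main__":
    per_instance, directed, pairs = mesh_load(5000)
    print(f"per instance: {per_instance}")
    print(f"directed connections: {directed}")
    print(f"unique pairs: {pairs}")
    print(f"fits under a 1024 soft limit: {fits_in_fd_limit(per_instance, 1024)}")
```

So "5000 x 5000" is about 25 million if each side dials the other, or about 12.5 million if each pair shares one socket. Either way, a commonly seen default soft limit of 1024 descriptors would not cover the per-instance share without tuning.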