We used Ubuntu 18.04 with the 4.15.0-1031-aws kernel, with the sysctl.d overrides shown in our /etc/sysctl.d/10-dummy.conf. We used Erlang 21.2.6-1 on a 36-core c5.9xlarge instance.
To run this test, we used Stressgrid with twenty c5.xlarge generators.
100K/sec was achieved by yours truly 10 years ago on a contemporary Xeon with nothing but nginx and Python 2.6, with gevent patched to switch the stack rather than copy it. (EDIT: and also a FIFO I/O scheduler)
They are purposely holding the connections open for 1+10% seconds. So first of all, it means that, for a rate of 100k conn/s, they are going to have around 200k open connections after a second. This already imposes a different profile than 100k single-request connections per second.
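For a rough sense of how hold time changes the profile, Little's law says the mean number of open connections is roughly arrival rate times mean hold time. A minimal sketch, where the function name and the hold times are hypothetical values picked for illustration, not figures from the article:

```python
def open_connections(rate_per_sec: float, hold_time_sec: float) -> float:
    """Little's law: mean concurrent connections = arrival rate * mean hold time."""
    return rate_per_sec * hold_time_sec


# Hypothetical hold times, for illustration only.
for hold in (0.01, 1.0, 2.0):
    count = open_connections(100_000, hold)
    print(f"hold {hold} s -> about {count:,.0f} open connections")
```

The same 100k conn/s arrival rate leaves very different numbers of sockets open depending on how long each one is held, which is why a held connection and a fire-and-forget request are not comparable workloads.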
You are also assuming that they need 36 cores to achieve 100k connections per second, which is likely not the case since they quickly moved the bottleneck to the OS. I am assuming they have other requirements that force them to run on such a large machine and they want to make sure they are not running into any single-core bottlenecks (and having a large number of cores makes it much easier to spot those).
I highly doubt you were able to do 100k connections/sec 10 years ago with the same hardware. You must be confusing requests/sec with connections/sec, which are very different things.
If you read the article, it's in the third paragraph or so:
> What this means, performance-wise, is that measuring requests per second gets a lot more attention than connections per second. Usually, the latter can be one or two orders of magnitude lower than the former. Correspondingly, benchmarks use long-living connections to simulate multiple requests from the same device.
Yeah, what's being discussed here are connections without any I/O over them. Just an fd lingering somewhere in an epoll pool. Which obviously is even less taxing. So your point is?
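To make "an fd lingering in an epoll pool" concrete, here is a minimal sketch using Python's standard `selectors` module. It assumes `socket.socketpair()` as a stand-in for accepted TCP connections, and `park_idle_connections` is a hypothetical helper name, not anything from the article:

```python
import selectors
import socket


def park_idle_connections(n: int):
    """Register n idle sockets for read readiness, then poll once."""
    sel = selectors.DefaultSelector()  # picks epoll on Linux
    # Assumption: socket pairs stand in for accepted client connections.
    pairs = [socket.socketpair() for _ in range(n)]
    for server_side, _client_side in pairs:
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
    # Nothing has been written to any socket, so a non-blocking poll
    # reports no ready file descriptors.
    ready = sel.select(timeout=0)
    return sel, pairs, ready


sel, pairs, ready = park_idle_connections(100)
print(len(sel.get_map()), "registered,", len(ready), "ready")

# Clean up the file descriptors.
for server_side, client_side in pairs:
    sel.unregister(server_side)
    server_side.close()
    client_side.close()
sel.close()
```

This only shows the parked state of an idle connection. It does not exercise the accept path, which is what a connections-per-second figure measures.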
Nothing tells more about an engineer than the last undocumented unreproducible hello world micro benchmark conducted by her once and only once some years ago that beats a real world application in terms of req/s leaving out latency profile.
I think the limiting factor might not be the number of cores but something outside of Erlang's scope, such as the Ethernet card they used, the network infrastructure, etc. Even Elixir could be something that impacts the tests.
Without the business logic (which was in django IIRC) and deployment details, obviously. Very outdated and some later patches might be missing. No one was interested, you see.
I'd be surprised if there were problems with the network, and if there were, that should have been obvious in the metrics.
Single-request connections. Response required consulting memcached and updating it from postgres if out of luck, which was very rare but still needed (and patching then-existing postgres C client to be async aware was an undertaking)
What does that mean? You keep qualifying "connections." It's a connection. It holds onto its connection for X period of time. An HTTP request is just a single-request connection, which is NOT what this article is discussing.