The secret is keeping performance in mind throughout your whole tech stack and in your application code. Do that, and you might not even need to scale beyond a couple machines for redundancy, depending on your SLOs.
I’ve said this a couple of times before, but it bears repeating: nginx can easily serve a million 1 kB static files per second over TLS from a single machine with a modern Xeon/EPYC CPU. Serving dynamic content doesn’t have to be more than one or two orders of magnitude slower than that, which still leaves roughly 10,000 to 100,000 requests per second per machine.
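To put rough numbers on that, here is a back-of-envelope sketch in Python. It only does the arithmetic implied by the figures above; it is not a benchmark. The function names are made up for illustration, "1 kB" is assumed to mean 1000 bytes, and HTTP header and TLS record overhead are ignored.

```python
# Back-of-envelope arithmetic for the figures quoted above.
# Assumptions: 1 kB = 1000 bytes; HTTP header and TLS overhead are ignored.
REQUESTS_PER_SEC = 1_000_000   # static-file rate quoted in the comment
PAYLOAD_BYTES = 1_000          # size of each static file


def payload_gbit_per_sec(requests_per_sec: int, payload_bytes: int) -> float:
    """Payload-only bandwidth in Gbit/s for a given request rate."""
    return requests_per_sec * payload_bytes * 8 / 1e9


def dynamic_range(static_rps: int) -> tuple[int, int]:
    """Request rates two and one orders of magnitude below the static rate."""
    return static_rps // 100, static_rps // 10


print(payload_gbit_per_sec(REQUESTS_PER_SEC, PAYLOAD_BYTES))  # 8.0
print(dynamic_range(REQUESTS_PER_SEC))                        # (10000, 100000)
```

So the static case works out to about 8 Gbit/s of payload alone, which is more than a 1 Gbit/s link can carry, and the dynamic case lands somewhere between 10,000 and 100,000 requests per second on one machine.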