by kjw
1607 days ago
I would not have guessed Roblox was on-prem with so little redundancy. Later in the post, they address the obvious “why not public cloud?” question. They argue that running their own hardware gives them cost and performance advantages. But those seem irrelevant if usage and revenue go to zero when you can’t keep the service up. It will be interesting to see how well this architectural decision ages if they keep scaling toward their ambitions. I wonder about their ability to recruit the level of talent required to run a service at this scale.
|
Playing "what-if" thought experiments is fun, but when the rubber hits the road, you often find that things that are stable for 99.99%+ of load patterns encounter previously unforeseen problems once you get into that far-right-hand side of the scale. And it's not like we've completely mastered squeezing performance out of huge CPU core counts on NUMA architectures while avoiding bottlenecking on critical sections in software. This shit is hard, man.