
It depends on whether you look at it from the ops point of view or the end user's point of view. You expressed concern about 1 million customers simultaneously having a bad experience. For a given end user, if the hardware is equally reliable, the odds of something happening are the same whether they are sharing with 1 million others or 100,000 (or even have the server to themselves). On the ops side, with more machines there is more to go wrong, so failures will be more frequent but will affect fewer end users each time.
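
To put rough numbers on that, here is a quick sketch. The 1% yearly failure probability and the name `fleet_stats` are made up for illustration, and it assumes machines fail independently, are equally reliable, and that users are spread evenly across them.

```python
def fleet_stats(total_users, machines, p_fail):
    """Compare the user view and the ops view for a fleet of identical machines.

    p_fail is a hypothetical probability that any one machine fails in a year.
    """
    users_per_machine = total_users / machines
    return {
        # User view: your machine fails with probability p_fail, however many share it.
        "per_user_risk": p_fail,
        # Ops view: expected number of machine failures per year across the fleet.
        "expected_failures": machines * p_fail,
        # Impact: how many users a single failure takes down.
        "users_hit_per_failure": users_per_machine,
        # Impact times risk: expected users affected per year.
        "expected_users_affected": machines * p_fail * users_per_machine,
    }


one_big = fleet_stats(total_users=1_000_000, machines=1, p_fail=0.01)
ten_small = fleet_stats(total_users=1_000_000, machines=10, p_fail=0.01)

for name, stats in (("one big machine", one_big), ("ten machines", ten_small)):
    print(name, stats)
```

Under those assumptions the per-user risk and the expected number of users affected come out the same in both setups. What changes is the shape: ten machines means ten times as many incidents for ops, each hitting a tenth as many users.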

The upside of the one-big-machine scenario is that you can put serious effort into keeping that machine reliable. The advantage of the many-machines scenario is that you are more likely to have well-tested failover solutions.

It is the combination of impact and risk that I am discussing.