| Two main solutions (given that the article mentions that they have huge machines already):- A) Allow a single solar system to span multiple machines. Very hard, especially if the server software isn't architected for this. Retrofitting it can be nigh on impossible. B) Have a few huge machines that can be used to host scenarios like this and, more importantly, have a way of migrating users over to the huge machine seamlessly. Option (B) can be done but it's tricky, especially if transferring game state between instances of the server is not simple (I'm not talking about transferring the VM itself with something like vMotion). It comes down to:- 1) Being able to make the bigger machine act as a temporary proxy, pushing connection data back to the smaller machine. 2) Having a way of telling clients to make a new connection to the bigger machine and, once that connection is made (and the data is being proxied to the smaller machine), cut the connection to the smaller machine. Users see no loss of service or reconnects at all. 3) Once all clients are being proxied by the bigger machine, pause, transfer the game state from the smaller machine to the big machine, and then continue. Obviously it works best if a chunk of state can be transferred in the background, so that the final transfer (and pause) only has to carry the bang-up-to-the-minute state and is as short as possible. Option (A) is always the proverbial "In v2 of the server we'll do it a completely different way..." |
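To make steps 1 to 3 of option (B) a bit more concrete, here's a toy, single-process sketch in Python. Everything in it is made up for illustration: `Server`, `Client`, `start_migration` and `finish_migration` are hypothetical names, the "game state" is just a dict, and there's no real networking. It only shows the ordering: proxy first, move clients, bulk copy while the game runs, then a short pause to ship the last few changes.

```python
class Server:
    """Toy game server (hypothetical): holds state and applies client messages."""

    def __init__(self, name):
        self.name = name
        self.state = {}        # key -> value game state
        self.dirty = set()     # keys changed since the last snapshot
        self.upstream = None   # when set, messages are proxied to this server
        self.paused = False
        self.queued = []       # messages held back while paused

    def handle(self, key, value):
        if self.upstream is not None:
            # Step 1: act as a proxy and push the data back to the old machine.
            self.upstream.handle(key, value)
        elif self.paused:
            self.queued.append((key, value))
        else:
            self.state[key] = value
            self.dirty.add(key)

    def snapshot(self):
        """Copy the full state and start tracking changes made after the copy."""
        self.dirty.clear()
        return dict(self.state)

    def delta(self):
        """Return only the keys changed since the last snapshot."""
        return {k: self.state[k] for k in self.dirty}


class Client:
    """Toy client (hypothetical): 'server' stands in for its live connection."""

    def __init__(self, name, server):
        self.name = name
        self.server = server

    def send(self, key, value):
        self.server.handle(key, value)


def start_migration(small, big, clients):
    # Step 1: the big machine proxies everything back to the small one.
    big.upstream = small
    # Step 2: each client connects to the big machine, then drops the old link.
    for client in clients:
        client.server = big
    # Step 3 (background part): bulk copy while the game keeps running.
    return small.snapshot()


def finish_migration(small, big, staged):
    # Step 3 (short pause): ship only what changed since the bulk copy.
    small.paused = True
    staged.update(small.delta())
    big.state = staged
    big.upstream = None        # the big machine is now authoritative
    # Replay anything that arrived at the small machine during the pause.
    for key, value in small.queued:
        big.handle(key, value)
    small.queued.clear()
```

Usage would be something like: call `start_migration(small, big, clients)`, let play carry on (clients now talk to `big`, which forwards to `small`), then call `finish_migration(small, big, staged)` with the snapshot the first call returned. In a real server the hard part is that the state is rarely a flat dict you can snapshot and diff this cheaply, which is exactly why the state transfer is the tricky bit.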
Yet players have a tendency to figure out when places are too overcrowded to be fun. So your old problematic load is almost never representative of how many players wanted to be in that area, but merely how many players were willing to put up with that level of degraded performance.
So upon release (or sufficiently close to it to start stress testing, which is conveniently when it's too late to really change architecture) the new limits are quickly hit.