For Broadcast, we can scale by adding additional nodes to the cluster but up to a point. The current architecture works because every node is aware of all other nodes in the cluster, but this strategy becomes untenable after a certain point. The good news is that we're nowhere near that point and we'll optimize and develop other strategies to circumvent this eventual limitation. We'll also need to carefully manage and process the inflow and outflow of messages on each node to make sure we're not delaying message delivery or using more memory than absolutely necessary.
For Presence, it syncs the joins and leaves from node to node every 1.5 seconds (default) and there will be a point where this work will take longer than that interval. Fortunately, the underlying ORSWOT CRDT implementation can be refactored to make syncing more efficient.
That sounds like you might take good advantage of machines with more than one network card, send the local updates via broadcast to all the machines in a cluster via one network and do the internet side via the other.
I'm not the only one who's noticed that the development history of Eve Online looks a good deal like someone Reinventing Erlang Badly. But I suspect one could mine their developer blogs, especially the older stuff, for ideas on sharding and de/mux tricks. A lot of functionality they had to sweat is probably relatively straightforward/slightly clever genserver logic.
What I recall is that there were several layers to handle receiving actions from users and broadcasting state to other users, others to represent entities and a few more to represent locations.
Like WoW in the early and middle years, the Open World feel is a bit of a fiction. WoW started with sharding akin to availability zones, but at ten million subscribers there’s only so much sharding you can do before people can’t meet each other. There were artificial choke points designed into both games that could be leveraged to offload some of the most intense interactions to separate hardware. Later WoW introduced phasing, which turned the map into a 4 dimensional space where actions were not observed by all observers (and map state could vary based on quest progress). The grouping system at that point had more situations where it could match you up with anyone in the same region not just the same availability zone, with or without moving you to a new hardware to do so.
Everyone noted how brilliant it was going to be for scaling the system up, but I’ve found that the best scaling designs are bidirectional. Both of these games made a system where they could respond to post-content-release flash crowds, and also power more hardware down. The other thing Blizzard did that in a non-cloud world I found particularly clever: before a major content release they would upgrade hardware, migrate users to the new machines, use the old hardware for load shedding tasks, release the content, wait for traffic to subside, then decommission old hardware to get back to steady state. You get to test the new hardware before the Big Show, and rely on burned in hardware to get you through to the other side. Very good operational intelligence IMO.
In Blizzard’s case some of the scaling tricks I mentioned earlier also improved the problem of users getting orphaned by declining subscriptions being spread out between too many AZs. In a pinch when traffic was low, you had access to the entire region (though I can’t say if that was controlled algorithmically or not). Previously the best they could do was a sort of traffic shaping by offering to move people from one server to another for free, where usually they charged a fee for doing so. That could take months to achieve load shedding. My suspicion is that at some points they had pools of hardware that belonged to the region and was parceled out between zones based on load. That allows you to absorb flash mobs better because those will be isolated to one zone instead of dependent on game state or time of day (after work on Friday everywhere with a low ping rate to that region, for instance).
For Broadcast, we can scale by adding additional nodes to the cluster but up to a point. The current architecture works because every node is aware of all other nodes in the cluster, but this strategy becomes untenable after a certain point. The good news is that we're nowhere near that point and we'll optimize and develop other strategies to circumvent this eventual limitation. We'll also need to carefully manage and process the inflow and outflow of messages on each node to make sure we're not delaying message delivery or using more memory than absolutely necessary.
For Presence, it syncs the joins and leaves from node to node every 1.5 seconds (default) and there will be a point where this work will take longer than that interval. Fortunately, the underlying ORSWOT CRDT implementation can be refactored to make syncing more efficient.