I don't know how MMOs actually do this, but I would assume that for an MMO to be scalable, you do some sort of population-based geometric slicing of the world and assign each slice to a server. Players then communicate with the server for their own slice and with the servers for any adjacent slices that are in visible or soon-to-be-visible range. That would mean no interaction between the servers, just between clients and servers. It also means you can scale out smoothly by cutting one server's area in two and handing half to a new server.
Edit - And if a group of players raids a dungeon, the population of that dungeon is strictly limited, so you can park that raid on one server and not worry at all about inter-player latency.
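To make the slicing idea from the first paragraph concrete, here is a minimal sketch in Python. It is only an illustration of the assumption above, not how any particular MMO does it: the rectangular `Region`, the `servers_for` lookup, the `view_range` parameter, and the median-based `split_by_population` are all hypothetical names and choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical axis-aligned slice of the world owned by one server."""
    server: str
    x0: float
    y0: float
    x1: float
    y1: float

    def distance_to(self, x: float, y: float) -> float:
        # Distance from a point to the rectangle; 0.0 if the point is inside.
        dx = max(self.x0 - x, 0.0, x - self.x1)
        dy = max(self.y0 - y, 0.0, y - self.y1)
        return (dx * dx + dy * dy) ** 0.5


def servers_for(regions, x, y, view_range):
    """Servers a client at (x, y) talks to: its own slice plus any slice in view."""
    return sorted(r.server for r in regions if r.distance_to(x, y) <= view_range)


def split_by_population(region, player_xs, new_server):
    """Cut a region in two along x, at the median player position (assumed policy)."""
    xs = sorted(player_xs)
    cut = xs[len(xs) // 2] if xs else (region.x0 + region.x1) / 2
    if not (region.x0 < cut < region.x1):
        # Fall back to the geometric midpoint if the median sits on an edge.
        cut = (region.x0 + region.x1) / 2
    left = Region(region.server, region.x0, region.y0, cut, region.y1)
    right = Region(new_server, cut, region.y0, region.x1, region.y1)
    return left, right


regions = [Region("a", 0, 0, 100, 100), Region("b", 100, 0, 200, 100)]
print(servers_for(regions, 50, 50, 10))   # deep inside slice "a"
print(servers_for(regions, 95, 50, 10))   # near the border with "b"
print(split_by_population(regions[0], [10, 20, 80, 90, 95], "c"))
```

A client near a border ends up connected to two servers at once, which is what lets the servers themselves stay independent of each other in this model.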
The latency-critical stuff traditionally happens in dungeons or other instances, precisely to get those players onto a shared physical server. You just have a fleet of servers that can each handle X instances, with a queue in front of them. And conveniently, player state can just be synced during the loading time before and after the dungeon.
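As a rough sketch of that "fleet plus queue" setup (again in Python, and again with made-up names: `InstanceFleet`, `request`, `finish`, and the least-loaded placement rule are assumptions for illustration, not a description of a real game's backend):

```python
from collections import deque


class InstanceFleet:
    """Hypothetical pool of servers, each able to run a fixed number of instances."""

    def __init__(self, servers, capacity_per_server):
        self.capacity = capacity_per_server
        self.running = {name: set() for name in servers}  # server -> party ids
        self.waiting = deque()                            # parties queued for a slot

    def _least_loaded(self):
        # Pick the server with the fewest instances, or None if all are full.
        if not self.running:
            return None
        name = min(self.running, key=lambda n: len(self.running[n]))
        return name if len(self.running[name]) < self.capacity else None

    def request(self, party_id):
        """Start an instance for the party, or queue it. Returns the server or None."""
        server = self._least_loaded()
        if server is None:
            self.waiting.append(party_id)
            return None
        self.running[server].add(party_id)
        return server

    def finish(self, server, party_id):
        """End an instance; give the freed slot to the next queued party, if any."""
        self.running[server].discard(party_id)
        if self.waiting and len(self.running[server]) < self.capacity:
            next_party = self.waiting.popleft()
            self.running[server].add(next_party)
            return next_party
        return None


fleet = InstanceFleet(["s1", "s2"], capacity_per_server=1)
print(fleet.request("party-1"))        # placed on a server
print(fleet.request("party-2"))        # placed on the other server
print(fleet.request("party-3"))        # fleet is full, so the party waits
print(fleet.finish("s1", "party-1"))   # freed slot goes to the queued party
```

The whole party always lands on one server, so everything latency-sensitive inside the dungeon stays in a single process; syncing player state would hook in around `request` and `finish`, during the loading screens.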
The bigger world is handled by slicing it up, but you still have a lot of communication going on with central databases for stuff like inventory management, chat, quests, etc., so you would probably try to keep all of that within your own server racks.