by je42 3397 days ago
Wondering why they have edge nodes and not just a load balancer. Looking at their responsibilities, there seems to be a lot of overlap with the responsibilities of a load balancer.

Looks like a load issue: a single server can only handle so many connections, auth requests, etc. Having several solves the problem and scales horizontally, but each edge maintains connection information, so a given user needs to be routed to the same server every time. The LBs, on the other hand, all share a single address and are interchangeable; they exist purely to route the user to the right edge server (the one holding the existing connection, or the least busy one).
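To make the routing rule above concrete, here is a minimal Python sketch. It is not how the OP's load balancer is implemented; `EdgeRouter`, the edge names, and the connection counts are all hypothetical and only illustrate "existing connection, else least busy".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EdgeRouter:
    """Hypothetical routing table for a load balancer (illustration only)."""

    # edge name -> number of users currently assigned to it
    load: Dict[str, int]
    # user id -> edge that already holds this user's connection
    assigned: Dict[str, str] = field(default_factory=dict)

    def route(self, user_id: str) -> str:
        edge = self.assigned.get(user_id)
        if edge is not None and edge in self.load:
            # Existing connection: keep sending the user to the same edge.
            return edge
        # New (or orphaned) user: pick the least busy edge.
        edge = min(self.load, key=lambda name: self.load[name])
        self.load[edge] += 1
        self.assigned[user_id] = edge
        return edge


router = EdgeRouter(load={"edge-a": 2, "edge-b": 0})
first = router.route("alice")  # least busy edge is chosen
again = router.route("alice")  # same edge as the first call
```

Because the table lives in the balancer rather than in the edges, any LB holding the same table could answer the same way, which is what makes the LBs interchangeable in this sketch.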

Question to OP: do you run your edge servers in pairs or some kind of cluster?

Not the OP, but I work on the same team.

The edge servers are not clustered and share no state. We require a minimum of 2 servers for fault tolerance.

Did the need for edge servers arise from the fact that a service like https://aws.amazon.com/elasticloadbalancing/applicationloadb... didn't exist back then?
Well, we need edge servers to handle the persistent websocket connections, which last for the life of the player session.
Yeah, the ALB provides that feature as well. I just checked: it was released in Aug 2016. So before that time your setup clearly makes a lot of sense.

I am wondering if the ALB would be the preferred method now, if you were to redo it?

Right, ALBs were introduced after we built RMS. We would have to re-evaluate ALB stability/cost/scalability - but it is definitely something to consider.
Thanks for replying!

So what happens to a player session when an edge server dies?

Do you have a way to rehydrate the session on a different server?

When the edge server dies, the player will reconnect to another node (the load balancer will select a healthy one this time). At the same time, RMS will detect that the session was lost abruptly and will buffer any outgoing messages addressed to that session for a short time, in case the player reconnects.
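A rough Python sketch of that buffering behaviour, for readers who want something concrete. This is not RMS code: the `LostSessionBuffer` class, its method names, and the 30-second grace period are assumptions (the comment above only says "a short time").

```python
import time


class LostSessionBuffer:
    """Hypothetical sketch: hold messages for abruptly lost sessions."""

    def __init__(self, grace_seconds=30.0, clock=time.monotonic):
        self._grace = grace_seconds  # assumed value, not from the thread
        self._clock = clock          # injectable so the sketch can be tested
        self._lost = {}              # session id -> (lost_at, [messages])

    def mark_lost(self, session_id):
        # Called when an abrupt disconnect is detected.
        self._lost[session_id] = (self._clock(), [])

    def _expired(self, lost_at):
        return self._clock() - lost_at > self._grace

    def offer(self, session_id, message):
        """Buffer a message for a lost session; return False if not buffered."""
        entry = self._lost.get(session_id)
        if entry is None:
            return False  # session is not marked lost
        lost_at, messages = entry
        if self._expired(lost_at):
            del self._lost[session_id]  # grace period over: give up
            return False
        messages.append(message)
        return True

    def reconnect(self, session_id):
        """Return the buffered messages to replay, oldest first."""
        entry = self._lost.pop(session_id, None)
        if entry is None:
            return []
        lost_at, messages = entry
        return [] if self._expired(lost_at) else messages


now = [0.0]  # fake clock so the example does not need to sleep
buf = LostSessionBuffer(grace_seconds=30.0, clock=lambda: now[0])
buf.mark_lost("s1")
buf.offer("s1", "move")
now[0] = 10.0
replayed = buf.reconnect("s1")  # player came back within the grace period
```

If the player returns after the grace period, `reconnect` returns an empty list in this sketch, so the client would have to start from a fresh session.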