by diath 556 days ago
> in actual production, people prefer to operate at the container level + traffic management, and dont touch anything deeper than the container

How do you think video games like World of Warcraft or Path of Exile deploy restartless hotfixes to millions of concurrent players without killing instances? I don't think it's a matter of "prefer to"; it's a matter of "can we completely disrupt the service for users and potentially lose some of the state?" Even if that disruption lasts a mere millisecond, in some contexts it's not acceptable.

5 comments

Most of those hotfixes are data-driven, as in database updates. The game server just reloads the data; the binary itself is not touched.

I've never seen a game where they hot reload code inside the game server itself; it's usually downtime or rolling updates.

> Most of those hotfixes are data-driven, as in database updates. The game server just reloads the data; the binary itself is not touched.

And since the data from the disk/database (whether it's a Lua table, XML structure, JSON object, or a query) is then represented as a low-level data structure, that's essentially what hot reloading is: in the simplest terms, you deserialize the new data and hot-swap the pointers.
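To make the "deserialize and hot-swap the pointers" idea concrete, here is a minimal Python sketch. It is not how any particular game does it; `TuningTable`, `example_boss`, and the numbers are all made up for illustration. The point is only that the new data is parsed off to the side and then a single reference is rebound, so live state (here, damage already dealt) is never touched.

```python
import json
import threading


class TuningTable:
    """Holds the active tuning data. Hypothetical class, for illustration only."""

    def __init__(self, initial_json: str) -> None:
        self._lock = threading.Lock()
        self._table = json.loads(initial_json)

    def reload(self, new_json: str) -> None:
        # Parse completely before touching live state, so a malformed
        # payload raises here and the old table stays in place.
        new_table = json.loads(new_json)
        with self._lock:
            self._table = new_table  # the "pointer swap": one reference rebind

    def get(self, key: str) -> dict:
        with self._lock:
            return self._table[key]


# Hypothetical boss name and stats.
stats = TuningTable('{"example_boss": {"health": 1000}}')

# Live encounter state lives elsewhere and survives the reload.
fight_state = {"boss": "example_boss", "damage_taken": 300}


def remaining_health() -> int:
    max_health = stats.get(fight_state["boss"])["health"]
    return max_health - fight_state["damage_taken"]


before = remaining_health()
stats.reload('{"example_boss": {"health": 1500}}')  # hotfix arrives mid-fight
after = remaining_health()
```

The encounter keeps its accumulated damage while the boss's maximum health changes underneath it, which matches the "health values update mid-fight" behavior described in this thread.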

> I've never seen a game where they hot reload code inside the game server itself; it's usually downtime or rolling updates.

In World of Warcraft, you will literally have bosses despawn mid-fight and spawn again with new stats or you will see their health values update mid-fight, all without the players getting interrupted, their spell state getting desynced, or spawned items in the instance disappearing. This can be observed with the release of every single new raid on live streams as Blizzard employees are watching the world first attempts and tweaking/tuning the fights as they happen.

EDIT: Here's such an example. For the majority of the fight, the extra tank could keep a spawned monster away from the boss. Then, mid-fight, the monster suddenly started one-shotting the tank, with no disruption to the instance. This was Blizzard's way of addressing a cheese strat, forcing the players to do the fight as designed: https://www.youtube.com/watch?v=7gMm60BXAjU

Yes but again it's not hot swapping code as in Erlang, the C++ code is unchanged, they just change some xml somewhere.

By your definition, every CRUD app has hot reloading capabilities.

> Yes but again it's not hot swapping code as in Erlang, the C++ code is unchanged, they just change some xml somewhere.

Right, not on the C++ side, but on the Lua side that WoW uses - you load the new gameplay code that pulls the new data, and override the globals with new functions.
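As a rough analogue of "override the globals with new functions", here is a small Python sketch (Python standing in for Lua; `script_globals`, `load_script`, and `on_hit` are hypothetical names, not anything from WoW). The host looks handlers up by name on every call, so rebinding the name swaps the behavior while the encounter state stays as it was.

```python
# Stands in for the scripting VM's global table (hypothetical).
script_globals = {}

SCRIPT_V1 = """
def on_hit(state, amount):
    state["health"] -= amount
"""

# The "hotfix": same function name, new behavior (damage capped per hit).
SCRIPT_V2 = """
def on_hit(state, amount):
    state["health"] -= min(amount, 100)
"""


def load_script(source: str) -> None:
    """Run new script source and rebind the functions it defines."""
    new_defs = {}
    exec(compile(source, "<hotfix>", "exec"), new_defs)
    for name, value in new_defs.items():
        if callable(value) and not name.startswith("__"):
            script_globals[name] = value  # override the global


def dispatch(event: str, *args):
    # Looked up by name at call time, so a reload takes effect on the next call.
    return script_globals[event](*args)


state = {"health": 1000}  # live state, not reset by reloading code

load_script(SCRIPT_V1)
dispatch("on_hit", state, 250)
health_after_v1 = state["health"]

load_script(SCRIPT_V2)  # hot reload while the "fight" is in progress
dispatch("on_hit", state, 250)
health_after_v2 = state["health"]
```

The compiled host (the `dispatch` side) never changes; only the script-level function bound to `on_hit` does.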

Why does the language matter? C++ has tooling that allows hot swapping, no?
C++ because 99% of the major games are built in that language.
LPMUDs ran almost entirely on hot reloadable code written in a quirky language called LPC, which later inspired the Pike language.

I believe that only the "driver" code, which handles system calls and hosts the LPC interpreter and is written in C, couldn't be hot reloaded; everything else running in the game could be reloaded without restarting the server.

I'd guess in the modern day, there would be some games where Lua scripts can be hot-reloaded like any other data, from a database or object store.

It's a rather fun language and programming environment, I'd recommend playing around with it over doing AoC.
In addition to what most people said, many other game servers just simply announce upcoming maintenance work and take the services offline until the patches are deployed.

This way they can properly test everything and roll back any potential fixes if required. Even banking systems regularly go down for maintenance.

WoW restarts every week. Not sure that's better than zero-downtime deployments.
That's just how it works when your backend is hybrid software that combines a low-level compiled language with a high-level language running in its own VM. You can use the latter for gameplay features and hotfix it on the go, while core changes require a restart. That is also why WoW hotfixes the scripted side on the go, usually every day during an expansion launch, and defers the bulk of backend changes to the next weekly restart rather than continuously disrupting the game for players.
That’s a very big assumption that they do code hotpatching.

It would seem far more likely that they separate the stateful layer (database) from the stateless layer (game logic), spin up a new instance of the stateless server layer behind a reverse proxy, and spin down the old instance. It's basically how all websites update without downtime.
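For reference, the pattern described here can be sketched in a few lines of Python. This is an in-process toy, not a real reverse proxy; `Proxy`, `handler_v1`, and `handler_v2` are hypothetical names. The handlers keep no state of their own, all state lives in the shared store, and the proxy just points at whichever handler version is current.

```python
from typing import Callable, Dict

Store = Dict[str, int]
Handler = Callable[[Store, str], int]


def handler_v1(store: Store, player: str) -> int:
    """Old stateless logic: award 1 point per request."""
    store[player] = store.get(player, 0) + 1
    return store[player]


def handler_v2(store: Store, player: str) -> int:
    """New stateless logic: award 2 points per request."""
    store[player] = store.get(player, 0) + 2
    return store[player]


class Proxy:
    """Toy stand-in for a reverse proxy in front of the stateless layer."""

    def __init__(self, store: Store, backend: Handler) -> None:
        self.store = store      # the stateful layer, shared across versions
        self.backend = backend  # the currently active stateless instance

    def handle(self, player: str) -> int:
        return self.backend(self.store, player)

    def switch(self, new_backend: Handler) -> None:
        # New requests go to the new instance; the old one can be spun down.
        self.backend = new_backend


store: Store = {}
proxy = Proxy(store, handler_v1)
first = proxy.handle("alice")
proxy.switch(handler_v2)
second = proxy.handle("alice")
```

This only works cleanly because nothing important lives in the handler's memory between requests. A game instance that holds in-flight combat state in process memory does not fit that assumption without extra work to hand the state over.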

A website that just proxies to another server does not need to do much to restore the previous state and make the switch look seamless to a user: the client will just perform another GET request that triggers a few SELECT queries. It's far more complex in the context of a video game.
Games do in fact have downtimes on major releases and you have to restart the client too before connecting.
For major patches or backend changes that require recompiling, yes. For gameplay tweaks and hotfixes, no: hot reloading is preferable where possible.