by jayunit 1877 days ago
I'm curious about how race conditions would be handled when multiple users, on different regional LiveView servers, take conflicting actions.

In the "Let's walk it through" section, it seems like the Player-to-LiveView connection will process user input (e.g. a Tic-Tac-Toe move) and update the UI to acknowledge this, at which point the user can be assured that the LiveView server accepted their input. But it seems like this happens before the GameServer has also accepted the input. What if Player 2 made a conflicting play and their change was accepted by the GameServer before Player 1's change reached the GameServer?

Granted, Tic-Tac-Toe is simple enough that this is neatly avoided: each regional LiveView server has enough information to only allow the current player to make a play. But in more complex applications, how might you (anyone; curious for discussion) handle this?

One answer is something like: the LiveView server is effectively producing optimistic updates, and the GameServer would need to produce an authoritative ordering of events and tell the various LiveView servers which of the optimistic updates lost a race and should be backed out.
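
A minimal sketch of that idea, assuming a version counter on the authoritative state. The module name `OptimisticSketch` and the `based_on`, `cell`, and `player` fields are hypothetical and not taken from the article. The authority accepts a move only if it was made against the current version, and a rejection carries the authoritative state that the sender should fall back to.

```elixir
defmodule OptimisticSketch do
  # Hypothetical authoritative state: a version counter plus the board
  # (a map of cell => player). Not taken from the article.
  def new, do: %{version: 0, board: %{}}

  # Accept a move only if it was made against the current version (`v` is
  # matched in both arguments) and the cell is still free.
  def submit(%{version: v, board: board} = state, %{based_on: v, cell: cell, player: player})
      when not is_map_key(board, cell) do
    {:accepted, %{state | version: v + 1, board: Map.put(board, cell, player)}}
  end

  # Stale version or occupied cell: the sender lost the race and should
  # replace its optimistic view with the returned authoritative state.
  def submit(state, _stale_or_conflicting_move), do: {:rejected, state}
end

state = OptimisticSketch.new()

# Player 2's move reaches the authority first and is accepted.
{:accepted, state} = OptimisticSketch.submit(state, %{based_on: 0, cell: 4, player: :p2})

# Player 1's move was also made against version 0, which is now stale.
{:rejected, ^state} = OptimisticSketch.submit(state, %{based_on: 0, cell: 4, player: :p1})
```

Rejecting every stale move is the bluntest possible policy. A real application might instead try to rebase a non-conflicting move onto the newer state.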

5 comments

> What if Player 2 made a conflicting play and their change was accepted by the GameServer before Player 1's change reached the GameServer?

Not sure I understand the question, but I don't see how this would happen.

On the BEAM it's processes all the way down. There's a process for that instance of the game, which is basically a big state machine, and 2 processes representing the client state, one for each player.

When the game (process) starts, it expects a message from player 1 (process), then one from player 2, and so on.

If there's a client timeout or network disconnection, the affected player process crashes. If the app has been architected well, the other player process and the game process are in the same supervision tree, so they crash as well, perhaps after notifying the other player that the game has ended because their peer disconnected.

But none of this will accept a move from Player 2 when it's Player 1's turn.
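
A minimal sketch of such a game process, assuming a plain GenServer. The module name `TurnSketch` and its `move/3` function are hypothetical, not the article's code. Because a GenServer handles one message at a time, moves are serialized, and pattern matching on the `turn` field rejects anything sent out of turn.

```elixir
defmodule TurnSketch do
  use GenServer

  # Hypothetical client API; names are illustrative, not from the article.
  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)
  def move(game, player, cell), do: GenServer.call(game, {:move, player, cell})

  @impl true
  def init(:ok), do: {:ok, %{turn: :player1, board: %{}}}

  # A move is accepted only when it comes from the player whose turn it is
  # (`player` is matched in both the message and the state) and the cell
  # is still empty.
  @impl true
  def handle_call({:move, player, cell}, _from, %{turn: player, board: board} = state)
      when not is_map_key(board, cell) do
    new_state = %{state | turn: other(player), board: Map.put(board, cell, player)}
    {:reply, {:ok, new_state.board}, new_state}
  end

  # Anything else (wrong turn, occupied cell) leaves the state untouched.
  def handle_call({:move, _player, _cell}, _from, state) do
    {:reply, {:error, :rejected}, state}
  end

  defp other(:player1), do: :player2
  defp other(:player2), do: :player1
end

{:ok, game} = TurnSketch.start_link()

# Player 2 tries to move first and is turned away; Player 1 is accepted.
{:error, :rejected} = TurnSketch.move(game, :player2, 0)
{:ok, _board} = TurnSketch.move(game, :player1, 0)
```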

Thanks for your reply!

This is very interesting - I'm pretty unfamiliar with BEAM. Does this "processes all the way down" span across machines/VMs?

From the article, it seemed like there could be two players, each connecting to different LiveServer instances (on different VMs/hardware in different geographic regions) which in turn communicate async via one central GameServer.

In the article, it seems like a message from Player 1 to LiveServer 1 doesn't need to wait for the message to also reach the central GameServer and be acknowledged before LiveServer 1 acks the change back to Player 1. This seems to allow races, since the central GameServer is the source of truth but the Player1/LiveServer1 communication can complete a message/ack round-trip without waiting for acknowledgement from the GameServer.

I guess an alternative would be for the system to require a message from Player 1 to be passed to LiveServer 1, then passed on to the central Game Server which acks back to LiveServer 1, which finally can ack back to Player 1 -- this means that Player 1 would still need to pay full round-trip latency to LS1 and then to the GameServer for any action.

Thanks for any light you can shed on this!

Here's the relevant part from the article:

> The browser click triggers an event in the player's LiveView. There is a bi-directional websocket connection from the browser to LiveView.

> The LiveView process sends a message to the game server for the player's move.

> The GameServer uses Phoenix.PubSub to publish the updated state of game ABCD.

> The player's LiveView is subscribed to notifications for any updates to game ABCD. The LiveView receives the new game state. This automatically triggers LiveView to re-render the game immediately pushing the UI changes out to the player's browser.

So you can see that when Player 1 performs an action, the action is sent to the GameServer. Player 1's UI is only updated once the GameServer has published the new game state via PubSub back to Player 1's LiveView process, which then pushes it to the client. So there is the latency of going from client to LiveView to GameServer and back again, but there is no possibility of a race.
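
The flow quoted above can be sketched with plain processes, assuming a bare `spawn_link` loop as a stand-in for the GameServer and direct `send/2` calls as a stand-in for Phoenix.PubSub. The module name `BroadcastSketch` and the message shapes are hypothetical. The point is only that the "LiveView" side renders nothing until the broadcast arrives.

```elixir
defmodule BroadcastSketch do
  # Stand-in for the GameServer: applies moves one at a time and broadcasts
  # the resulting state to every subscriber (Phoenix.PubSub in the article).
  def start_game(subscribers) do
    spawn_link(fn -> game_loop(subscribers, []) end)
  end

  defp game_loop(subscribers, moves) do
    receive do
      {:move, player, cell} ->
        moves = moves ++ [{player, cell}]
        for pid <- subscribers, do: send(pid, {:game_state, moves})
        game_loop(subscribers, moves)
    end
  end
end

# The calling process plays the role of a player's LiveView: it forwards
# the move and only "renders" once the game process publishes new state.
game = BroadcastSketch.start_game([self()])
send(game, {:move, :player1, 4})

receive do
  {:game_state, moves} -> IO.inspect(moves, label: "render")
after
  1_000 -> raise "no broadcast received"
end
```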

> I'm pretty unfamiliar with BEAM. Does this "processes all the way down" span across machines/VMs?

Yes. For example, if you had a process registered under the name "Alice" on a different machine (called a node in Erlang), you could send it a message from another node by passing the node identifier along with the name. Example:

[coolest_node | _rest_of_nodes] = Node.list()

# send/2 accepts a {registered_name, node} tuple as the destination
send({Alice, coolest_node}, :hi)

Ah, that's indeed a good question, though those are implementation details of Fly.io I'm unaware of.
You could use CRDTs for more complex games that are not turn-based.

https://moosecode.nl/blog/how_deltacrdt_can_help_write_distr...

It is turn-based, so sync is easy: whatever you do, unless it is your turn, can be safely ignored. Once you move to non-turn-based ...
It's the same as with your mobile phone when you lose your Wi-Fi signal. Everything pauses and everybody has to wait.

Have you played games like HOMAM 1 or 2? You can't do anything when the CPU is playing its players. You can watch where he goes and what he does but that's it. When he is finished then you go.

When there is a network error, a "please wait..." message or a loading spinner should be shown in the meantime.

For turn based RPG games or Chess etc. this is a non issue.

Of course, real-time action games etc. are not a good fit for this technology.

Your answer is pretty close to what most people do: https://en.m.wikipedia.org/wiki/Client-side_prediction

There's no need for client-side prediction or optimistic UI on (most) LiveView projects.

It's all done on the server.

Latency is the reason. Even in a turn based game it still feels really bad to make a move and have to wait for it to make its way through the round trip before seeing the result. In a game with strict ordering like Tic Tac Toe there is little reason not to show the chosen move immediately.

Sure, that's why I meant most use cases.

I mean, 100 ms between a click and a cross appearing on screen is not great user experience, but it's not even the worst. If you're writing a game, a little client side prediction is a good idea.

But if you have a form with instant validation, or any old regular UI, that is not necessary at all. The only built-in optimistic UI functionality in LiveView is disabling a button while you wait for the server to respond after pressing it, to avoid double submissions.

> But if you have a form with instant validation, or any old regular UI, that is not necessary at all.

Arguably that's because you're trusting the client, so the behavior, whether custom-built or built-in, is essentially optimistic by default. Then, hopefully, you validate server-side on submission.

From the tech talks I vaguely recall, LiveView folks seem to disregard latency, which is where the entire model falls apart for me because the moment you need more control on the client over what to do when the server is not responding - you’re entirely out of luck.

Though maybe I’m wrong and there have been some new developments to address this; I wasn’t following too closely.

On the contrary, the LiveView documentation acknowledges this and suggests handling such scenarios with client-side tools:

> There are also use cases which are a bad fit for LiveView: Animations - animations, menus, and general UI events that do not need the server in the first place are a bad fit for LiveView. Those can be achieved without LiveView in multiple ways, such as with CSS and CSS transitions, using LiveView hooks, or even integrating with UI toolkits designed for this purpose, such as Bootstrap, Alpine.JS, and similar.

https://hexdocs.pm/phoenix_live_view/Phoenix.LiveView.html#m...

Sorry, that’s not acknowledging that latency can become an issue, that’s acknowledging that using server-side rendering for things that don’t require a server isn’t the best of ideas (shocker, I know).

Possibly many LiveView tech demos and community projects haven't put much thought into latency, but LiveView itself ships with a built-in latency simulator [0]. It can also toggle classes on elements when you click them and remove them again once an acknowledgment has been received from the backend [1]. Finally, you have JS hooks, through which you can implement any kind of loading indication you want on the frontend. So the tools are there; they just need to be used.

[0] https://hexdocs.pm/phoenix_live_view/js-interop.html#simulat...

[1] https://hexdocs.pm/phoenix_live_view/js-interop.html#loading...

One trick I remember using (~two years ago, so early LV) when handling click events was to put everything that was asynchronous, or not needed for the reply, inside a spawn() call.

But yes as soon as you're on the internet you'll often feel the delay if your app is interactive.

The problem is that it's a bit random, because network and VM performance are never perfectly consistent.

I remember implementing a countdown (using a 1s send_after()) that would work fine most of the time, but sometimes there would be a hiccup: the countdown would stall for a moment and then run through the missed ticks in an accelerated fashion, which was terrible from a UI point of view. In the end I did it in JS, except for the update once the end was reached.
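
For the server-side part of that countdown, one possible mitigation is to derive the displayed value from a fixed deadline instead of decrementing a counter on every tick. This is a sketch under assumptions: the module name `CountdownSketch` and the `render` callback are hypothetical, and it only helps when the stall is in tick delivery on the server, not in the network path to the browser.

```elixir
defmodule CountdownSketch do
  # Remaining whole seconds until `deadline_ms`, rounded up, never below zero.
  # Both arguments are milliseconds on the same monotonic clock.
  def remaining_seconds(deadline_ms, now_ms) do
    max(div(deadline_ms - now_ms + 999, 1000), 0)
  end

  # Hypothetical tick loop: each tick recomputes the value from the clock,
  # so a late tick skips ahead instead of replaying every missed second.
  def run(deadline_ms, render) do
    now = System.monotonic_time(:millisecond)
    left = remaining_seconds(deadline_ms, now)
    render.(left)

    if left > 0 do
      Process.send_after(self(), :tick, 1000)

      receive do
        :tick -> run(deadline_ms, render)
      end
    else
      :done
    end
  end
end

# Count down two seconds from now, printing each value as it is rendered.
deadline = System.monotonic_time(:millisecond) + 2_000
CountdownSketch.run(deadline, fn left -> IO.puts("#{left}s left") end)
```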