by sillysaurusx 1734 days ago
It’s telling that every answer is “just deploy servers near your users.”

One of YouTube’s most pivotal moments was when they saw their latency skyrocket. They couldn’t figure out why.

Until someone realized it was because their users, for the first time, were worldwide. The Brazilians were causing their latency charts to go from a nice <300ms average to a >1.5s average. Yet obviously that was a great thing, because if Brazilians want your product so badly they’re willing to wait 1.5s every click, you’re probably on to something.

Mark my words: if elixir takes off, someday someone is going to write the equivalent of how gamedevs solve this problem: client side logic to extrapolate instantaneous changes + server side rollback if the client gets out of sync.
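
A minimal sketch of the gamedev technique described above (client-side prediction plus server reconciliation), written in TypeScript. Every name here (`MoveInput`, `PlayerState`, `PredictingClient`) is hypothetical and made up for illustration; none of it is a LiveView or game-engine API. The client applies each input immediately, keeps it in a pending list, and when the server's authoritative state arrives it rolls back to that state and replays whatever the server has not yet acknowledged.

```typescript
// Hypothetical types for illustration only.
type MoveInput = { seq: number; dx: number };
type PlayerState = { x: number };

// Pure step function; in a real system the server would run the same logic.
function applyInput(state: PlayerState, input: MoveInput): PlayerState {
  return { x: state.x + input.dx };
}

class PredictingClient {
  private pending: MoveInput[] = [];
  private nextSeq = 1;
  state: PlayerState = { x: 0 };

  // Apply the input locally right away and remember it until it is acked.
  press(dx: number): MoveInput {
    const input: MoveInput = { seq: this.nextSeq++, dx };
    this.pending.push(input);
    this.state = applyInput(this.state, input);
    return input; // the caller would send this to the server
  }

  // The server reports its authoritative state and the last seq it processed.
  reconcile(serverState: PlayerState, lastAckedSeq: number): void {
    // Drop inputs the server has already accounted for.
    this.pending = this.pending.filter((i) => i.seq > lastAckedSeq);
    // Roll back to the server's state, then replay unacknowledged inputs.
    this.state = this.pending.reduce(applyInput, serverState);
  }
}

const client = new PredictingClient();
client.press(1);
client.press(1);
client.press(1); // the client shows x = 3 with no round trip

// Suppose the server rejected the second move, so after seq 2 it has x = 1.
client.reconcile({ x: 1 }, 2);
console.log(client.state.x); // 2: the server's 1 plus the replayed third input
```

The visible correction from 3 to 2 is the "occasional glitch" that games tolerate; whether a web UI can tolerate it depends on what the state represents.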

Or they won’t, and everyone will just assume 50ms is all you need. :)

4 comments

> It’s telling that every answer is “just deploy servers near your users.”

This isn't the takeaway at all. The takeaway is we can match or beat SPAs that necessarily have to talk to the server anyway, which covers a massive class of applications. You'd deploy your SPA-driven app close to users for the same reason you'd deploy your LiveView application, or your assets – reducing the speed of light distance provides better UX. It's just that most platforms outside of Elixir have no distribution story, so being close to users involves way more operational and code-level concerns and becomes a non-starter. Deploying LiveView close to users is like deploying your game server closer to users – we have real, actual running code for that user so we can do all kinds of interesting things being near to them.

The way we write applications lends itself to being close to users.

Imagine how painful HN would be if you upvoted someone and didn’t see the arrow vanish till the server responded. Instead of knowing instantly whether you missed the button, you’d end up habitually tapping it twice. (Better to do that than to wait and go “hmm, did I hit the button? Oh wait, my train is going through a tunnel…”)

Imagine how painful typing would be if you had to wait after each keypress till the server acknowledged it. Everyone’s had the experience of being SSH’ed into a mostly-frozen server; good luck typing on a phone keyboard instead of a real keyboard without typo’ing your buffered keys.

The point is, there are many application-specific areas where client-side prediction is necessary. Taking a hardline stance of “just deploy closer servers” will only handicap elixir in the long run.

Why not tackle the problem head-on? Unreal Engine might be worth studying here: https://docs.unrealengine.com/udk/Three/NetworkingOverview.h...

One could imagine a “client eval” code block in elixir which only executes on the client, and which contains all the prediction logic.

You'd use the optimistic UI features that LiveView ships with out of the box to handle the arrow click, and you wouldn't await a server round-trip for each keypress, so again that's not how LiveView form input works. For posterity, I linked another blog where I talk exactly about these kinds of things, including optimistic UI and "controlled inputs" for the keyboard scenario: https://dockyard.com/blog/2020/12/21/optimizing-user-experie...

While we can draw parallels to game servers being near users, I don't think it makes sense for us to argue that LiveView should take the same architecture as an FPS :)

> Deploying LiveView close to users is like deploying your game server closer to users – we have real, actual running code for that user so we can do all kinds of interesting things being near to them.

Then why do you start running forward instantly when you press “W” in counterstrike or quake? Why not just deploy servers closer to users?

Gamedev and webdev are more closely related than they seem. Now that webdev is getting closer, it might be good to take advantage of gamedev’s prior art in this domain.

There’s a reason us gamedevs go through the trouble. That pesky speed of light isn’t instant. Pressing “w” (or tapping a button) isn’t instant either, but it may as well be.

> Then why do you start running forward instantly when you press “W” in counterstrike or quake? Why not just deploy servers closer to users?

You do both? Game client handles movements and writes game state changes to a server, which should be close to the user to reduce the possibility for invalid state behaviors? You really haven't seen online games that deploy servers all over the world to reduce latency for their users? What?

Both web apps and games do optimistic server writes. Both web apps and games have to accommodate a failed write. Both web apps and games handle local state and remote state differently.
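
To make "optimistic write, accommodate a failed write, treat local and remote state differently" concrete, here is a small TypeScript sketch. `OptimisticValue` and its methods are hypothetical names invented for this example, not part of any framework. It keeps the last server-confirmed value separate from a proposed one, renders the proposal immediately, and falls back to the confirmed value if the write fails.

```typescript
// Hypothetical helper for illustration only.
class OptimisticValue<T> {
  private confirmed: T;
  private proposed: { value: T } | null = null;

  constructor(initial: T) {
    this.confirmed = initial;
  }

  // What the UI should render right now.
  get visible(): T {
    return this.proposed ? this.proposed.value : this.confirmed;
  }

  // True while a write is in flight and not yet acknowledged.
  get isTentative(): boolean {
    return this.proposed !== null;
  }

  // Show the new value immediately; the caller sends the write to the server.
  propose(value: T): void {
    this.proposed = { value };
  }

  // The server accepted the write: the proposal becomes the confirmed value.
  confirm(): void {
    if (this.proposed) {
      this.confirmed = this.proposed.value;
      this.proposed = null;
    }
  }

  // The server rejected the write or it timed out: fall back to confirmed.
  reject(): void {
    this.proposed = null;
  }
}

const upvoted = new OptimisticValue<boolean>(false);
upvoted.propose(true);
console.log(upvoted.visible, upvoted.isTentative); // true true
upvoted.reject(); // e.g. the train went through a tunnel
console.log(upvoted.visible, upvoted.isTentative); // false false
```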

I read his post as a criticism of how little optimistic updating is done in web apps, and how bad the user story is. Why can't it be easy to build every app as a collaborative editing tool without writing your own OT or CRDT?

Because an occasional glitch when the client & server sync back up is acceptable in a game. Finding out that my order didn't actually go through is much worse. Especially since click button, see success, and close browser is a relatively common use case.

Consider these two scenarios.

1. SPA with asynchronous server communication. A button switches to a spinner the moment you click it, and spins until the update is safe at the server. Error messages can show up near the button, or in a toast.

2. LiveView where updates go via the server. The button shows no change (after recovering from submit "bounce" animation) until a response from the server has come back to you. To do anything better, you need to write it yourself, and now you're back in SPA world again.
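
Scenario 1 can be sketched as a tiny state machine in TypeScript. The types and the `nextButtonState` function are hypothetical, written only for this example. The point is that the click moves the button to a spinner state locally, with no round trip, while success is only shown once the server has acknowledged the update.

```typescript
// Hypothetical types for illustration only.
type ButtonState =
  | { kind: "idle" }
  | { kind: "pending" } // spinner shown
  | { kind: "done" }
  | { kind: "failed"; message: string }; // error shown near the button

type ButtonEvent =
  | { type: "click" }
  | { type: "ack" }
  | { type: "error"; message: string };

function nextButtonState(state: ButtonState, event: ButtonEvent): ButtonState {
  switch (event.type) {
    case "click":
      // Switch to the spinner immediately; ignore clicks while in flight.
      return state.kind === "pending" ? state : { kind: "pending" };
    case "ack":
      // Only report success once the server has confirmed the write.
      return state.kind === "pending" ? { kind: "done" } : state;
    case "error":
      return state.kind === "pending"
        ? { kind: "failed", message: event.message }
        : state;
  }
}

let button: ButtonState = { kind: "idle" };
button = nextButtonState(button, { type: "click" });
console.log(button.kind); // "pending", before any server response
button = nextButtonState(button, { type: "error", message: "network down" });
console.log(button.kind); // "failed"
```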

There's a reason textarea input isn't sent to a server with the server updating the visible contents! Same thing applies to all aspects of the UX.

EDIT: https://dockyard.com/blogs/optimizing-user-experience-with-l... talks about this. That'll handle things like buttons being disabled while a request is in flight, but it won't e.g. immediately add new TODO items to the classic TODO list example.

That's a deliberate UI choice, though, and it doesn't always make sense in non-transactional workflows. It's easy to wait for Google Docs to say "Saved to Drive", and going to a new page to save a document would be really disruptive to your workflow, for example.

I remember this story but can't find it anywhere. If I recall correctly they deployed a fix that decreased the payload size. However, in doing so they actually opened the door to users with slow connections that were unable to use it at all before. So measured latency actually went up instead of down.

That’s the one! Where the heck is it? It’s one of my all time favorite stories, but it seems impossible to find; thanks for the details.

YES! Thank you! I’ve seriously been searching for like five decades. What was the magical search phrase? “YouTube Brazil increase latency” came back with “How YouTube radicalized Brazil” and other such stories. (Turns out the article mentions “South America” rather than “Brazil”; guess my Dota instincts kicked in.)

Anyway, you rock. :)

Thank you! It was impossible to find anything on Google since any variant of "youtube", "latency" etc showed results for problems with YouTube or actual YouTube videos talking about latency.

The trick was to use HN search: "youtube latency" and select Comments. First result was a comment on https://www.forrestthewoods.com/blog/my_favorite_paradox/ which links the story in the "Bonus!" section.
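
The paradox in the story recalled above (a smaller payload let users on slow connections load the page at all, so measured latency went up) is easy to check with arithmetic. The TypeScript below uses made-up numbers purely for illustration; they are not YouTube's data.

```typescript
// All numbers are invented for illustration.
function meanLatency(samplesMs: number[]): number {
  return samplesMs.reduce((sum, ms) => sum + ms, 0) / samplesMs.length;
}

// Before the fix: only 900 users on fast connections can use the page at all.
const latenciesBefore = new Array<number>(900).fill(250);

// After the fix: those 900 users get faster, and 300 users on slow
// connections can now load the page, at 4 seconds each.
const latenciesAfter = [
  ...new Array<number>(900).fill(200),
  ...new Array<number>(300).fill(4000),
];

console.log(meanLatency(latenciesBefore)); // 250
console.log(meanLatency(latenciesAfter)); // 1150
```

Every existing user got faster (250ms to 200ms), yet the average rose from 250ms to 1150ms because the population being measured changed.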

> Mark my words: if elixir takes off, someday someone is going to write the equivalent of how gamedevs solve this problem: client side logic to extrapolate instantaneous changes + server side rollback if the client gets out of sync.

most games have the benefit that they're modeling the mechanics of physical objects moving around in the world and are having their users express their intentions through spatial movement. the first gives a pretty healthy prior in terms of modeling movement when data drops out and the latter can be fairly repetitive and thereby learnable and predictable.

whether or not user interaction behaviors can be learned within the context of driving web applications seems a little less clear, to me at least. it does seem like there are a lot more degrees of freedom.

Nothing so complicated. All that's needed is a local cache so that when you type a new message in the chat window, you immediately see it appear when you hit submit (optionally with an indication of when the message was received by the peer). But there's quite a bit of tooling required to reliably update the local cache, run the code both in the client and on the server.

Firebase does this brilliantly with Firestore queries. Any data mutation by the client shows up in persistent searches immediately, flagged as tentative until server acknowledges.
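
A rough TypeScript sketch of the local-cache idea from the two comments above. This is not Firestore's API; `LocalMessageCache` and its methods are hypothetical names chosen for illustration. A submitted message shows up in the list at once, flagged as pending, and the flag clears when the server acknowledges it.

```typescript
// Hypothetical types for illustration only; not a Firebase/Firestore API.
type ChatMessage = { id: string; text: string; pending: boolean };

class LocalMessageCache {
  // Map keeps insertion order, so messages stay in the order they were typed.
  private messages = new Map<string, ChatMessage>();

  // Called on submit: the message is visible immediately, marked tentative.
  addLocal(id: string, text: string): void {
    this.messages.set(id, { id, text, pending: true });
  }

  // Called when the server acknowledges the message.
  acknowledge(id: string): void {
    const message = this.messages.get(id);
    if (message) {
      this.messages.set(id, { ...message, pending: false });
    }
  }

  // Called when the server rejects the message.
  discard(id: string): void {
    this.messages.delete(id);
  }

  // What the chat window renders.
  list(): ChatMessage[] {
    return Array.from(this.messages.values());
  }
}

const cache = new LocalMessageCache();
cache.addLocal("m1", "hello");
console.log(cache.list()); // one message, pending: true
cache.acknowledge("m1");
console.log(cache.list()); // same message, pending: false
```

The tooling the parent comment mentions (keeping this cache consistent with the server and sharing logic between both sides) is exactly what this sketch leaves out.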

> server side rollback

server controlled client side rollback, you mean?