by dayjah 3640 days ago
Hi there, I'm one of the original engineers who worked on our re-implementation of chat which ended up in Go.

We've a culture of being willing to try new things at Twitch. When our twisted-python chat system stopped meeting our need to iterate easily, we decided to rebuild it. It was a monolith, and we decided to chunk it up to reflect the needs of our users and the pace at which we could develop new features. Notably, we wanted to not recycle TCP connections whenever a new feature was added (a shortcoming of the twisted-python solution, along with a bunch of global state that was becoming hard to reason about). As part of this rework we had a pub-sub portion which was super simple, and we decided to try this new, exciting language with a lot of promise out on it. It worked amazingly well. Over the course of another year or so we ended up rebuilding all of the components in Go.

When we first evaluated rebuilding chat we assessed a few options:

- python

- nodejs (we started with this, but random crashes and poor tooling at the time didn't work for us)

- erlang (notably, whether we could use ejabberd as the hub of the system)

Ultimately we chose python because we knew python and we needed this to work right now. The move to Go happened incrementally thereafter and was driven by:

- increase in trust

- great tooling

None of this can be pitched as "Go vs X"; it is purely a tools- and expediency-oriented set of decisions.

2 comments

> Notably we wanted to not recycle TCP connections whenever a new feature was added

So with the Go server, you're able to redeploy without closing open connections? Do you just run multiple versions in parallel and load balance over to the new version once connections close, or something else?

There are actually two (or more) different services. One sits and talks to the users via TCP, maintains the IRC connection state, and makes backend calls to the other, the bit that makes decisions and publishes information.

This allows us to almost never deploy changes to the first service, while frequently making changes to the second system. Of course when you do want to make changes to the first you have to reestablish all the TCP connections again, but if you engineer it correctly you can do it infrequently enough to be worthwhile.

Disclaimer: I don't actually work on the chat team, this is based upon various conversations with people on the chat team and may be incorrect in some specifics or out of date.

Yes, Dobbs captures this here. To be clear, the first re-write of the chat service was from twisted python into tornado python. In that re-write we produced a number of services which implemented the biz logic. One of those services was a TCP-terminating edge server which has very little logic in it besides how to call the biz logic and send messages to connected clients. Once this was all written we converted to Go incrementally.
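
For anyone who wants to picture the split, here is a rough Go sketch of the idea, not Twitch's actual code. Everything in it is hypothetical: the `Edge` and `Logic` names, the line-based protocol, and the in-process `SetLogic` swap, which only stands in for redeploying a separate backend service that the edge would really reach over the network.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
	"sync"
)

// Logic stands in for the frequently redeployed backend ("biz logic").
// Assumption: in a real system this would be a network call, not an
// in-process interface.
type Logic interface {
	Handle(line string) string
}

// LogicFunc adapts a plain function to the Logic interface.
type LogicFunc func(line string) string

func (f LogicFunc) Handle(line string) string { return f(line) }

// Edge is the thin, rarely deployed TCP-terminating service. It owns the
// client connections and only knows how to call the current Logic.
type Edge struct {
	mu    sync.RWMutex
	logic Logic
}

func NewEdge(l Logic) *Edge { return &Edge{logic: l} }

// SetLogic swaps the backend without touching any open connection.
func (e *Edge) SetLogic(l Logic) {
	e.mu.Lock()
	e.logic = l
	e.mu.Unlock()
}

// handleLine forwards one client line to whatever Logic is current.
func (e *Edge) handleLine(line string) string {
	e.mu.RLock()
	l := e.logic
	e.mu.RUnlock()
	return l.Handle(strings.TrimSpace(line))
}

// Serve accepts connections until the listener is closed.
func (e *Edge) Serve(ln net.Listener) error {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return err
		}
		go e.serveConn(conn)
	}
}

// serveConn relays each line from the client to the logic and back.
func (e *Edge) serveConn(conn net.Conn) {
	defer conn.Close()
	sc := bufio.NewScanner(conn)
	for sc.Scan() {
		if _, err := fmt.Fprintln(conn, e.handleLine(sc.Text())); err != nil {
			return
		}
	}
}

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // any free local port
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	edge := NewEdge(LogicFunc(func(s string) string { return "v1 saw: " + s }))
	go edge.Serve(ln)

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	replies := bufio.NewReader(conn)

	fmt.Fprintln(conn, "hello")
	first, _ := replies.ReadString('\n')

	// "Deploy" new logic; the client connection above stays open.
	edge.SetLogic(LogicFunc(func(s string) string { return "v2 saw: " + s }))
	fmt.Fprintln(conn, "hello")
	second, _ := replies.ReadString('\n')

	fmt.Print(first, second)
}
```

The point of the sketch is that the second reply comes from the new logic over the same TCP connection. The edge changes so rarely that the cost of reconnecting every client is only paid on the occasions when the edge itself has to be redeployed.
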

> We've a culture of being willing to try new things at Twitch

How is that different from NIH syndrome?

How about if you have a culture of being willing to try new things not invented here? That would be quite different from NIH syndrome.

being willing to try new things != trying things because they're new