Hacker News new | ask | show | jobs
by agentS 4153 days ago
A binary protocol's parsing is usually something like read 2 bytes from the wire, decode N = uint16(b[0] << 8) | uint16(b[1]), then read N bytes from the wire. A text-based protocol's parsing almost always involves a streaming parser, which is tricky to get correct, and always more inefficient.

Besides, I think this is a moot point, because chances are that less than 100 people's HTTP2 implementations will serve 99.9999% of traffic. It's not like you or I spend much of our time deep in nginx's code debugging some HTTP parsing; I think its just as unlikely we'll be doing that for HTTP2 parsing.

Also, HTTP2 will always (pretty much) be wrapped in TLS. So its not like you're going to be looking at a plain-text dump of that. You'll be using a tool and that tool author will implement a way to convert the binary framing to human-readable text.

Another way to put it is that the vast majority of HTTP streams are not examined by humans and only examined by computers. Choosing a text-based protocol just seems a way to adversely impact the performance of every single user's web-browsing.

Another another way to put it is that there is a reason that Thrift, Protocol Buffers, and other common RPC mechanisms do not use a text-based protocol. Nor do IP, TCP, or UDP, for that matter. And there's a reason that Memcached was updated with a binary protocol even though it had a perfectly serviceable text-based protocol.

1 comments

Agreed on all points. Binary protocols are no doubt better, faster, more efficient and more precise. I use reliable UDP all the time in game server/clients. Multiplayer games have to be efficient, TCP is even too slow for real-time gaming.

Binary protocols work wonderfully... when you control both endpoints, the client and the server.

When you don't control both endpoints is where interoperability breaks down. Efficiency and exactness can be enemies of interoperability at times, we currently use very forgiving systems instead of throwing them out and assert crash dump upon communication error. Network data is a form of communication.

Maybe you are right, since it is binary, only a few hundred implementations might be made and those will be made by better engineers since it is more complex. Maybe HTTP is really a lower level protocol like TCP/UDP etc now. Maybe since Google controls Chrome and the browser lead and has enough engineers to ensure all leading implementations and OSs/server libraries/webservers are correct then it may work out.

As engineers we want things to be exact, but there are always game bugs not found in testing and hidden new problems that we aren't weighing against the known current ones. Getting a something new is nice because all the old problems are gone, but there will be new problems!

It will be an all new experiment we try going away from text/MIME based to something more lower level, complex and exact over simple and interoperability focused. Let's see if the customers find any bugs in the release.

>Binary protocols work wonderfully... when you control both endpoints, the client and the server.

IP is all binary and I don't think it's a case of one party controlling all endpoints.