by KirinDave 2655 days ago
This is cool, and I like it. Very Haskell-like, which is a compliment in my book.

But one thing that surprises me is that folks are essentially sleeping on HTTP/2. HTTP/2 is just a hell of a lot better in most every dimension. It's better for handshake latency, it's better for bandwidth in most cases, it's better for eliminating excess SSL overhead and also, it's kinda easier to write client libraries for, because it's so much simpler (although the parallel and concurrent nature of connections will challenge a lot of programmers).

It's not bad to see a new contender in this space, but it's surprising that it isn't http/2 first. Is there a good reason for this? It's busted through 90% support on caniuse, so it's hard to make an argument that adoption holds it back.


The reason is that HTTP/2 will be short-lived. Most of its potential (protocol multiplexing, mainly) is wasted because it still runs over TCP/IP. HTTP/3 will correct that with QUIC. I think we can expect HTTP/3 to really see widespread adoption. HTTP/1.1 will remain ubiquitous, though, because there are a gazillion boxes that only speak that.
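A toy sketch of why multiplexing over TCP loses some of its benefit: TCP hands bytes to the application in one connection-wide order, so a single late packet holds up every stream behind it, while per-stream ordering (the approach QUIC takes) only delays the stream that lost the packet. The packet list, arrival times, and function name below are made up for illustration; this is not a model of either protocol's real loss recovery.

```python
from typing import Dict, List, Tuple

# Each packet is (stream_id, arrival_time_ms), listed in the order it was sent.
Packet = Tuple[str, int]


def stream_finish_times(packets: List[Packet], per_stream_ordering: bool) -> Dict[str, int]:
    """Return when each stream's last packet becomes readable by the application."""
    finish: Dict[str, int] = {}
    connection_ready = 0               # TCP-like: one ordering for the whole connection
    stream_ready: Dict[str, int] = {}  # QUIC-like: one ordering per stream
    for stream_id, arrival in packets:
        if per_stream_ordering:
            ready = max(stream_ready.get(stream_id, 0), arrival)
            stream_ready[stream_id] = ready
        else:
            connection_ready = max(connection_ready, arrival)
            ready = connection_ready
        finish[stream_id] = ready
    return finish


# Hypothetical trace: stream A's second packet is lost and its retransmission
# only arrives at 212 ms; everything else arrives on time.
packets = [("A", 10), ("B", 11), ("A", 212), ("B", 13), ("C", 14)]

tcp_like = stream_finish_times(packets, per_stream_ordering=False)
quic_like = stream_finish_times(packets, per_stream_ordering=True)
print(tcp_like)   # every stream waits for the retransmission
print(quic_like)  # only stream A waits
```

In this trace the connection-wide ordering finishes all three streams at 212 ms, while per-stream ordering lets B and C finish at 13 ms and 14 ms.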
I doubt it, unless you are expecting those that are yet to bother switching to HTTP/2, to jump directly to HTTP/3.

The migration to IPv6 shows how quickly the industry moves to new protocols.

To the extent that HTTP/3 is faster/cheaper/better than its predecessors, it should allow ads to be served more cheaply. Combine that with support in the dominant browser (and why wouldn't they, after all they did basically invent it and they have a vested interest in serving ads efficiently) and I'd say HTTP/3 has a pretty good shot at success.
> To the extent that HTTP/3 is faster/cheaper/better

This is a rather questionable assumption. For almost everyone in the world it's not faster/cheaper/better than HTTP/1.1, and definitely not by enough to be worth bothering with. So, the only way it can get anywhere is if Google abuses its position and forces everyone to adopt it. Which they probably won't do for such a silly thing; they get more out of it by coming up with more useless mediocre "faster/cheaper/better" protocols HTTP/[4567], because this pressures competition to waste resources on that, instead of on something that can compete with Google.

One way to get around this is to use HAProxy as a middleman where it can handle HTTP2 concurrent connections on the frontend and then connect to the HTTP1.1 webservers on localhost on the backend so that you don't have to pay the big TCP connection latency.
Right, and this is "better" but if you start handling significant volume you're going to want HTTP/2 behind your proxy. Envoy does this, and it's a basically free way to just get more out of your hardware and network.
Not necessarily. Let's say your backend server is multi-threaded and assigns each connection to its own thread for DoS safety. With HTTP/2 you make one connection to the backend server, so even though you are multiplexing those requests, they’re still being served serially by one thread+socket. Even if you demux and use thread handlers, you have to remux and are still limited to the single socket.

Now if HAProxy makes multiple connections to the backend each one gets served by its own thread+socket and that's going to load much faster because at the very least it's going to get more attention from the OS. Furthermore, if you use the keepAlive header, and set long timeouts, then you don’t even have to pay the connection penalty. So essentially you’ve shifted thread management to HAProxy by virtue of its parallel connections which keeps the webserver code pretty simple. And in a C++ program simplicity is key to correctness.

I'm not sure I agree with most of this post. Firstly, a userland socket mux/demux implementation isn't such an insurmountable challenge. It's essentially the core of a good http/2 implementation. If you've got a good HTTP/2 implementation, you've necessarily got a good mux/demux solution.

> Now if HAProxy makes multiple connections to the backend each one gets served by its own thread+socket and that's going to load much faster because at the very least it's going to get more attention from the OS.

I'm not sure what you're basing this on. What is the technical definition of "more attention from the OS"? If anything, limiting things down to a single process over one connection will improve latency. It can help minimize memory copies if you get volume, because you'll get more than one frame per read (and you certainly aren't serving a L>4 protocol out of your NIC). Most importantly, it'll remove connection establishment and teardown costs. These aren't free.
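To make "more than one frame per read" concrete, here is a minimal sketch of splitting a single socket read into HTTP/2 frames. The 9-byte frame header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream id) comes from the HTTP/2 framing layer; the helper names and the sample buffer are made up for illustration, and a real implementation would also handle settings, flow control, and header compression.

```python
import struct
from typing import List, Tuple

FRAME_HEADER_LEN = 9  # fixed-size HTTP/2 frame header
DATA_FRAME = 0x0      # frame type code for DATA


def build_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame (illustrative helper used to fake a socket read)."""
    length = len(payload)
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def split_frames(buffer: bytes) -> Tuple[List[Tuple[int, int, bytes]], bytes]:
    """Return ([(stream_id, frame_type, payload), ...], leftover_bytes)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= FRAME_HEADER_LEN:
        hi, lo, frame_type, _flags, raw_stream = struct.unpack_from(">BHBBI", buffer, offset)
        length = (hi << 16) | lo
        end = offset + FRAME_HEADER_LEN + length
        if end > len(buffer):
            break  # partial frame: wait for the next read
        payload = buffer[offset + FRAME_HEADER_LEN:end]
        frames.append((raw_stream & 0x7FFFFFFF, frame_type, payload))
        offset = end
    return frames, buffer[offset:]


# One hypothetical read that happens to contain two whole frames for two
# different streams, plus the first 12 bytes of a third frame.
one_read = (build_frame(DATA_FRAME, 0, 1, b"hello")
            + build_frame(DATA_FRAME, 0, 3, b"world")
            + build_frame(DATA_FRAME, 0, 5, b"partial")[:12])

frames, leftover = split_frames(one_read)
print(frames)
print(len(leftover))
```

The two complete frames are handed off by stream id, and the 12 leftover bytes stay buffered until the rest of the third frame arrives.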

Now, if you really are loading backends so that responsiveness is a problem, you'll need to appeal to whatever load balancing solution you have on hand. But most folks agree that this is faster.

> Furthermore, if you use the keepAlive header, and set long timeouts, then you don’t even have to pay the connection penalty.

But you still have head-of-line blocking, so you're still spamming N connections per client to get that concurrency factor. Connections aren't free, they have state and take up system resources. You'll serve more clients with one backend if you take up less resources per connection.

> So essentially you’ve shifted thread management to HAProxy by virtue of its parallel connections which keeps the webserver code pretty simple.

I don't think this is true at all. If you want to serve connections in parallel or access resources relevant to your service in parallel, you're going to need to do that. You can of course choose to NOT do this and spam a billion processes rather than use concurrency.

> And in a C++ program simplicity is key to correctness

I don't think this is unique to C++, but I'm also not sure it's really relevant here. From an application backend's perspective, it's very much the same model. They need a model that supports re-entrant request serving.

Why not just make multiple HTTP2 connections then?
First let me give some background. In the beginning there was HTTP/1.0, where you send a request, receive a reply, and then terminate the connection. A TCP connection required a three-packet handshake every time, so with one request and one reply, roughly 60% of the exchange was merely getting ready to talk.
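The 60% figure follows from counting packets. A rough sketch of the arithmetic, assuming the request and the reply each fit in one packet and ignoring connection teardown (both are simplifications, and the function name is made up):

```python
HANDSHAKE_PACKETS = 3  # SYN, SYN-ACK, ACK
REQUEST_PACKETS = 1    # assumption: the request fits in one packet
REPLY_PACKETS = 1      # assumption: the reply fits in one packet


def setup_share(requests_per_connection: int) -> float:
    """Fraction of packets spent on the handshake rather than on requests."""
    useful = requests_per_connection * (REQUEST_PACKETS + REPLY_PACKETS)
    return HANDSHAKE_PACKETS / (HANDSHAKE_PACKETS + useful)


print(setup_share(1))  # one request per connection, as in HTTP/1.0
print(setup_share(6))  # six requests reusing the same connection
```

With one request per connection the handshake is 3 of 5 packets; reusing the connection for six requests drops that to 3 of 15.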

HTTP/1.1 brought pipelining, where you could use the keepalive header and send req1, req2, reqN and then expect to receive reply1, reply2, replyN. The replies are expected in the order they are requested.

HTTP/2 adds a bunch of things. For one, the requests are in a binary format instead of text in order to achieve better compression. Another thing is that it allows multiplexing. This is different from pipelining because now you can receive replies out of order, which keeps small files from getting stalled behind large files.
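A small sketch of the difference, under a deliberately simple model: the link carries one chunk per tick, pipelined replies go out whole in request order, and multiplexed replies are interleaved one chunk at a time, round-robin. The sizes and function names are made up, and real HTTP/2 servers are free to schedule streams differently.

```python
from typing import List


def pipelined_finish(sizes: List[int]) -> List[int]:
    """Replies are sent whole, in request order; returns the finish tick of each."""
    finish, clock = [], 0
    for size in sizes:
        clock += size
        finish.append(clock)
    return finish


def multiplexed_finish(sizes: List[int]) -> List[int]:
    """Replies are interleaved chunk by chunk, round-robin over unfinished streams."""
    remaining = list(sizes)
    finish = [0] * len(sizes)
    clock = 0
    while any(remaining):
        for i, left in enumerate(remaining):
            if left > 0:
                clock += 1            # one chunk of stream i goes on the wire
                remaining[i] = left - 1
                if remaining[i] == 0:
                    finish[i] = clock
    return finish


# Hypothetical replies: one large file requested first, then two small ones.
sizes = [100, 1, 2]
print(pipelined_finish(sizes))    # small files wait behind the large one
print(multiplexed_finish(sizes))  # small files finish almost immediately
```

The total transfer time is the same in both cases (103 ticks); what changes is that the small replies finish at ticks 2 and 5 instead of 101 and 103.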

However, HAProxy will demux the HTTP/2 requests and separate them into multiple parallel connections to the backend server, where each connection supports pipelining so as not to close immediately. Each request will use a connection that’s free (i.e. doesn't already have a pipelined req in progress), so this is effectively the same as HTTP/2 multiplexing, since HAProxy will send them back to the client in the order they are received from the backend (which isn't necessarily the requested order, aka multiplexed).
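The scheme described above can be sketched as a small scheduling model: requests arrive together, each one is sent to whichever backend connection frees up first, and replies reach the client in completion order. The service times, pool size, and function name are made up, and this is not a description of HAProxy's actual scheduler.

```python
import heapq
from typing import List, Tuple


def dispatch(service_ms: List[int], pool_size: int) -> List[Tuple[int, int, int]]:
    """Return (finish_ms, request_index, connection_id), sorted by completion time."""
    # Min-heap of (time the connection becomes free, connection id).
    free_at = [(0, conn) for conn in range(pool_size)]
    heapq.heapify(free_at)
    done = []
    for index, cost in enumerate(service_ms):
        start, conn = heapq.heappop(free_at)      # earliest free connection
        finish = start + cost
        heapq.heappush(free_at, (finish, conn))   # busy until this request finishes
        done.append((finish, index, conn))
    return sorted(done)


# Hypothetical backend service times in milliseconds; request 0 is the slow one.
service_ms = [300, 20, 50, 10]

for finish, index, conn in dispatch(service_ms, pool_size=2):
    print(f"request {index} done at {finish} ms on connection {conn}")
```

With two connections the slow request occupies one of them while the other three complete on the second, so the client sees replies in the order 1, 2, 3, 0. With a single connection the order stays 0, 1, 2, 3 and everything waits behind the slow request.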

The benefit here is that

1) If you have multiple webservers, they don't each have to deal with the muxing/demuxing of streams and converting the binary to HTTP. (It might be possible to skip the binary translation, but the code will be ugly.)

2) If you have parallel connections each using thread pools to handle each request stream, then you could start running into thread contention problems

3) You protect your C++ webservice with a battle tested service like HAProxy

> HTTP/1.1 brought pipelining, where you could use the keepalive header and send req1, req2, reqN and then expect to receive reply1, reply2, replyN. The replies are expected in the order they are requested.

Remember though that because of the model, you cannot possibly serve these in parallel. You must serve them serially to be on spec.

> Each request will use a connection that’s free (i.e. doesn't already have a pipelined req in progress), so this is effectively the same as HTTP/2 multiplexing

It's not the same at all though, is it? HTTP/2 doesn't wait for each request to return. You could easily do exactly that same process with HTTP/2, and by decoupling the notion of "utilization" from "that connection is busy", you can actually balance to servers based on more sophisticated metrics.
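One way to picture balancing on something richer than "is that connection busy" is to score each backend by its open streams and recent latency. The fields, the scoring formula, and the sample numbers below are all hypothetical, chosen only to show a case where the idle backend is not the best pick.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    name: str
    in_flight: int         # streams currently open on the shared connection
    avg_latency_ms: float  # hypothetical rolling average kept by the proxy


def pick_backend(backends: List[Backend]) -> Backend:
    """Pick the backend with the lowest estimated cost for one more request."""
    return min(backends, key=lambda b: (b.in_flight + 1) * b.avg_latency_ms)


backends = [
    Backend("a", in_flight=4, avg_latency_ms=10.0),  # busy but fast
    Backend("b", in_flight=1, avg_latency_ms=30.0),
    Backend("c", in_flight=0, avg_latency_ms=80.0),  # idle but slow
]

print(pick_backend(backends).name)
```

A free/busy model would send the request to the idle backend "c"; the score here prefers "a", which is already serving four streams but answers much faster.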

> 2) If you have parallel connections each using thread pools to handle each request stream, then you could start running into thread contention problems

You already have these problems with resource management for application services.

> You protect your C++ webservice with a battle tested service like HAProxy

Why does the http/2 architecture not get a load balancer but the HTTP 1.1 architecture does?

I understand what you are saying. But why do you need to use http 1.1 behind the proxy for this setup? Why can't you use http 2 both behind and in front of the reverse proxy, but still demux into multiple connections at the proxy? Just because http 2 supports multiplexing, doesn't mean that you need to use it.
One possibility is that HTTP/2 is much more work to implement, and since it isn’t /absolutely/ required for base “serve a thing” functionality, it may be left to be implemented later.
> One possibility is that HTTP/2 is much more work to implement

If that was true, I'd be convinced by it. But is it true? HTTP 1.1 is a pretty big and complicated spec.

I think a more likely explanation is that HTTP/1.1 has been around forever (20 years!), while HTTP/2 is 4 years old, but about to be superseded.
I'm not sure that "HTTP/3 is about to overtake HTTP/2" is a very fair statement given how many years it took load balancers to start supporting HTTP/2.

HTTP/3 (QUIC) is genuinely exciting and I'm very eager to see it. But given it's even more different from HTTP/2 than HTTP/2 is from HTTP/1.1, it's probably gonna be another 3-4 years before we see good open source load balancer performance in front of it.

HTTP/2 is not what HTTP/1 is, it's more like a different layer. It may as well have a different name.
> it's kinda easier to write client libraries for, because it's so much simpler

What makes it simpler? I thought it was the other way around.

It conforms better to the async model that JavaScript loves. For everyone outside the single largest deployment in history, it's simpler because it's a smaller spec.
I see your point, though I'd very much prefer if we could split the stack into application server > http server > SSL terminating reverse proxy by default. That includes HTTP/2 handling in the proxy's task list. That would split concerns way more elegantly than having to replicate logic between http and https modules.
By that argument, they should get cracking on HTTP/3 already. :)

Even though HTTP/2 is usable, adoption is still low (last I checked it was around 25%), and any server that supports 2 needs to support 1 for backwards compatibility. It's also more complicated to implement, so I can see why you'd want to start with HTTP/1.
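For a sense of what "supports 2 but still needs 1" looks like on a TLS server, here is a minimal sketch using ALPN from Python's standard `ssl` module. `set_alpn_protocols` and `selected_alpn_protocol` are real standard-library calls; the handler names and the dispatch function are made up, and certificate loading is left out.

```python
import ssl
from typing import Optional


def make_server_context() -> ssl.SSLContext:
    """TLS context that offers HTTP/2 first and falls back to HTTP/1.1."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # A real server would also call ctx.load_cert_chain(...) with its certificate.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def choose_handler(negotiated: Optional[str]) -> str:
    """Map the ALPN result to a (hypothetical) protocol handler name."""
    # SSLSocket.selected_alpn_protocol() returns None when the client sent no
    # ALPN extension, so anything other than "h2" is treated as HTTP/1.1.
    return "http2" if negotiated == "h2" else "http1"


ctx = make_server_context()
print(choose_handler("h2"))
print(choose_handler("http/1.1"))
print(choose_handler(None))
```

After the handshake, a server would pass `tls_socket.selected_alpn_protocol()` to something like `choose_handler`, which is why both code paths have to exist even if HTTP/2 is the one you care about.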

HTTP/2 has over 90% support on caniuse, which is a decent approximation of what most folks selling products here will see. The browsers that don't support it aren't the sort we really care that much about anyways unless we're building a bank or something, in which case suffering is your mandate anyways.

As for client libraries for APIs, it's a non-issue. There's plenty of flexibility on backends.

I'm not saying you'd skip "HTTP/1" but I certainly wouldn't put a lot of effort into it.

Maybe they can just skip 2 and go straight to 3! :P
the HTTP/2 part could probably be handled via a reverse proxy
You lose out bigtime between your revproxy and your backends. You might as well abandon it entirely and go with grpc streaming or thrift.

Revproxies are essentially awful, ugly lynchpins in your distributed architecture. You want to be as low as you can afford to be on utilization for them, because if they have a bad day you have no product.

You will end up with proxy anyway. Don't tell me you expose your web apps to the public without any frontend protection?
Even without thinking about protection, load balancing and/or redundancy are also vital to a high-performance application.
I didn't say you wouldn't have a proxy. I said you want to have as few as you possibly can for volume and redundancy, because they're a centralized point of failure in your architecture and they have to scale over your entire infrastructure.

I am certainly NOT advocating for their lack. I'm not even sure how that could realistically work.

You can use just Nginx as a web server right? Used to do this when learning ruby. Not for prod code mind you.