| First, let me give some background. In the beginning there was HTTP/1.0,
where you send a request, receive a reply, and then terminate the connection. Every TCP connection required a three-packet handshake, so with a request and a reply on top, about 60% of the delay was merely getting ready to talk. HTTP/1.1 brought persistent connections (keep-alive) and pipelining, where you could send req1, req2, reqN and then expect to receive reply1, reply2, replyN. The replies are expected in the order they were requested. HTTP/2 adds a bunch of things. For one, the requests use a binary framing instead of text, and headers are compressed. Another thing is that it allows multiplexing. This is different from pipelining because now you can receive replies out of order, which keeps small files from getting stalled behind large files. However, HAProxy will demux the HTTP/2 requests and separate them into multiple parallel connections to the backend server, where each connection supports pipelining so as not to close immediately. Each request will use a connection that’s free (i.e. doesn't already have a pipelined request in progress), so this is effectively the same as HTTP/2 multiplexing, since HAProxy will send them back to the client in the order they are received from the backend (which isn't necessarily the requested order, aka multiplexed). The benefits here are: 1) If you have multiple webservers, they don't each have to deal with the muxing/demuxing of streams and converting the binary to HTTP (it might be possible to skip the binary translation, but the code will be ugly). 2) If you have parallel connections each using thread pools to handle each request stream, then you could start running into thread contention problems. 3) You protect your C++ webservice with a battle-tested service like HAProxy. |
Remember, though, that because of the pipelining model, you cannot possibly serve these in parallel on a given connection. You must serve them serially to be on spec.
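To put rough numbers on that, here is a toy sketch (my own simplified model, not anything taken from HAProxy or the spec). It assumes a pipelined connection handles requests strictly one after another, while multiplexed streams are each ready after their own service time; bandwidth sharing is ignored, and the function names are made up for illustration.

```python
from itertools import accumulate

def pipelined_delivery(service_times_ms):
    """Completion time of each response when requests are served serially,
    in request order, on a single pipelined connection."""
    # Each response finishes only after every earlier one has finished.
    return list(accumulate(service_times_ms))

def multiplexed_delivery(service_times_ms):
    """Completion time of each response when streams are served independently
    (simplifying assumption: no contention between streams)."""
    return list(service_times_ms)

# One large file followed by two small ones (hypothetical timings in ms).
times = [5000, 100, 200]
print(pipelined_delivery(times))    # [5000, 5100, 5300]
print(multiplexed_delivery(times))  # [5000, 100, 200]
```

The small responses sit behind the large one in the first case, which is the head-of-line blocking that multiplexing is meant to avoid.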
> Each request will use a connection that’s free (i.e. doesn't already have a pipelined request in progress) so this is effectively the same as HTTP/2 multiplexing
It's not the same at all though, is it? HTTP/2 doesn't wait for each request to return. You could easily do exactly that same process with HTTP/2, and by decoupling the notion of "utilization" from "that connection is busy", you can actually balance to servers based on more sophisticated metrics.
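As a rough illustration of the difference, here is a minimal sketch. The `Backend` type, its `capacity` budget, and both picker functions are hypothetical names for this example, not HAProxy's actual balancing logic.

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    in_flight: int = 0    # requests currently being processed
    capacity: int = 100   # hypothetical per-server concurrency budget

def pick_free_connection(backends):
    """'Connection is busy' model: only a backend with nothing in flight is usable."""
    for b in backends:
        if b.in_flight == 0:
            return b
    return None  # everything is busy, so the request has to wait

def pick_least_loaded(backends):
    """Utilization model: a backend with work in flight can still accept more."""
    candidates = [b for b in backends if b.in_flight < b.capacity]
    return min(candidates, key=lambda b: b.in_flight / b.capacity, default=None)

backends = [Backend("a", in_flight=3), Backend("b", in_flight=1)]
print(pick_free_connection(backends))    # None
print(pick_least_loaded(backends).name)  # b
```

The ratio used as the sort key is just one possible metric; the point is only that the choice no longer hinges on a connection being idle.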
> 2) If you have parallel connections each using thread pools to handle each request stream then you could start running into thread contention problems
You already have these problems with resource management for application services.
> You protect your C++ webservice with a battle tested service like HAProxy
Why does the HTTP/2 architecture not get a load balancer but the HTTP/1.1 architecture does?