by monstermonster 4269 days ago
I don't get this.

TCP has streams. TCP has connection mux. TCP has flow and congestion control. HTTP has keepalive. Why build another stack on OSI layer 7?

Also now we have to keep state to work out what the diffs are. State is evil.
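
To make the state concern concrete, here is a minimal sketch of delta-encoding headers against the previous request. It is an illustration only, not the actual HTTP/2 or SPDY header compression scheme; the function names `header_delta` and `apply_delta` are hypothetical. The point is that both ends must remember the previous header set for the diff to mean anything.

```python
def header_delta(previous, current):
    """Return (changed, removed) describing how current differs from previous."""
    # Headers that are new or whose value changed since the last request.
    changed = {k: v for k, v in current.items() if previous.get(k) != v}
    # Headers that were present last time but are gone now.
    removed = [k for k in previous if k not in current]
    return changed, removed


def apply_delta(previous, changed, removed):
    """Rebuild the full header set on the receiving side from its stored state."""
    headers = {k: v for k, v in previous.items() if k not in removed}
    headers.update(changed)
    return headers
```

If the receiver's copy of `previous` ever drifts from the sender's, every later request decodes incorrectly, which is the cost being complained about.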

Whilst I'm sure this will have some minor performance advancements, I'm not sure that it justifies the new protocol stack.

Not sending 2MB of JavaScript and crappy HTML down the connection to display the front page probably has higher gains.

3 comments

You'll need to call a meeting of all the internet's firewall administrators who block TCP ports by default but allow 80 and 443 through. If you can get them to agree to stop breaking the internet then we can use TCP. Until then we will need to build a new internet on top of HTTP, inside encryption so they can't meddle with it.
I don't understand your point.

80 and 443 are "well known ports"[1] which is fine.

What does this have to do with ports? TCP is connection based so a client can create as many connections as it likes to a port on a host.

If someone does indeed build a new "internet" built on top of HTTP which is tunnelled through well known ports with different services with the intention of circumventing the firewall then they will not be allowed through my firewall at all.

[1] https://www.ietf.org/rfc/rfc1700.txt

The problem is that opening new connections is horribly inefficient. The 2-3 round trips (TCP + SSL) required to set up a new connection and the ensuing slow start phase significantly delay the request, and thus the response. It is much more performant to use a single, well-utilized connection in congestion avoidance. The only way to avoid the ugliness of repeated flow control is to build on top of UDP (see QUIC), but there are practical issues with network connectivity and firewalls there.
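
A back-of-the-envelope sketch of that cost follows. The numbers are assumptions for illustration, not measurements: one round trip for the TCP handshake, two for SSL, a 1460-byte segment size, an initial window of 10 segments, and an idealized slow start that doubles the window every round trip with no loss.

```python
import math


def setup_delay_ms(rtt_ms, tcp_rtts=1, tls_rtts=2):
    """Delay before the first request byte can leave on a fresh connection."""
    return (tcp_rtts + tls_rtts) * rtt_ms


def slow_start_rounds(response_bytes, mss=1460, initial_window=10):
    """Round trips needed to deliver a response on a cold connection.

    Idealized model: the congestion window starts at initial_window
    segments and doubles every round trip, with no loss.
    """
    segments = math.ceil(response_bytes / mss)
    cwnd, sent, rounds = initial_window, 0, 0
    while sent < segments:
        sent += cwnd
        cwnd *= 2
        rounds += 1
    return rounds
```

Under these assumptions, a fresh connection at 100 ms RTT spends `setup_delay_ms(100)` = 300 ms before the request is even sent, and a 100 kB response then needs `slow_start_rounds(100_000)` = 3 more round trips. A warm connection already in congestion avoidance skips the setup and most of the ramp-up.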

EDIT: why do you want to block HTTP/2 by the way? You know that HTTP/1.1 can be used to tunnel other protocols too, right?

That's not horribly inefficient. That's the cost of doing business with HTTP.

In fact you're going to have to go to the same effort to establish a TCP connection that your HTTP/2 is going to run over, then still have to do a key exchange. That channel then has the same advantages of a persistent HTTP/1.1 channel plus the ability to provide multiple streams.

The multiple streams can be resolved simply by making more than one connection to the server defensively. Perhaps a mechanism to schedule that client-side would work. Oh wait, we already have one (connection limits and keep-alive).

Then again, all of this is moot as once you've loaded the static resources (images/css/js etc) via HTTP, you should only be seeing one request periodically when an operation takes place or at an interval if polling or kept alive for server-push so maximum two connections from a client to a server.

If you need to do anything more than that, you're probably using the wrong technology both on the server and client.

HTTP/1.1 tunnelling I understand. In fact I use it most of the day (RDP over terminal services gateway) which is RPC over HTTP.

The rationale I have is that effectively managing HTTP/2.0 at the firewall requires packet and protocol inspection rather than merely understanding what connections have been made and where from and where to. This has a significant complexity and tooling cost and effort. Plus there is a significant opportunity to mask illegitimate traffic as legitimate traffic. For those of us who deal with end user network security, this is a major problem.

That's not what HTTP/2 does. If what you want is to tunnel past a firewall, establish an SSL connection and run any number of existing VPN protocols over it, and you can continue to run TCP/IP just fine.
You can run TCP inside a TCP connection, but your bandwidth throttling gets a bit strange. Your inner TCP sees delays instead of packet losses and that isn't how TCP is built to throttle.

As a practical matter, the percentage of customers who will put up a VPN to use your service is vanishingly small.

Trust me. The only thing which will come from this is even more broken firewalls, more complex software, a gazillion new categories of bugs, and more vulnerabilities than are currently contained in all of PHP's combined codebase.

And then we'll be back at square one, ready to make this mistake all over again.

All true, but TCP has head-of-line blocking, which means even if resources are requested in parallel they can only be returned in the order they were requested.

In an ideal world we could switch to using something like the SCTP networking protocol with HTTP that would solve a lot of issues. Unfortunately we are stuck with TCP, so the application protocol (HTTP) now must implement a networking protocol so we can multiplex over a single connection.
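
A toy model of the difference, assuming each response is a number of equal-sized chunks and the link carries one chunk per tick (both simplifications, and the function names are made up for the sketch):

```python
def in_order_completion(sizes):
    """Responses are sent back to back in request order.

    Returns the tick at which each response finishes.
    """
    finish, t = [], 0
    for size in sizes:
        t += size
        finish.append(t)
    return finish


def interleaved_completion(sizes):
    """Responses are multiplexed round-robin, one chunk per unfinished response per turn.

    Returns the tick at which each response finishes.
    """
    remaining = list(sizes)
    finish = [0] * len(sizes)
    t = 0
    while any(remaining):
        for i, left in enumerate(remaining):
            if left:
                t += 1
                remaining[i] -= 1
                if remaining[i] == 0:
                    finish[i] = t
    return finish
```

With one large response queued ahead of two small ones, `in_order_completion([10, 1, 1])` gives `[10, 11, 12]` while `interleaved_completion([10, 1, 1])` gives `[12, 2, 3]`: the total time is the same, but the small resources no longer wait behind the big one.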

At least people won't have to inline resources, sprite images, or concatenate CSS and JavaScript anymore. And header compression is a small upgrade to the spec.

Couple of follow ups on this one:

SCTP is message oriented rather than stream oriented, so this isn't really useful. The chunk length field is also two bytes, meaning that all your messages have to be less than 64k or you have to implement packet reassembly and stuff. Oh look, back at TCP again.
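
For scale, a sketch of what that limit implies. This is not the SCTP wire format; it only shows the arithmetic of a two-byte length field and the split-and-rejoin step a sender would otherwise need, with hypothetical helper names.

```python
MAX_LEN = 2 ** 16 - 1  # largest value a two-byte unsigned length field can hold


def fragment(message, max_payload=MAX_LEN):
    """Split a message into pieces that each fit under the length limit."""
    return [message[i:i + max_payload] for i in range(0, len(message), max_payload)]


def reassemble(fragments):
    """Join fragments back together, assuming they arrive complete and in order."""
    return b"".join(fragments)
```

A 200,000-byte message becomes four fragments here, and `reassemble` only works because the sketch assumes ordered, lossless delivery.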

We must do nothing.

I suspect this entire SPDY/HTTP/2 reengineering effort is a 1000% complexity and risk increase for a 2-5% gain in performance. That is not a trade-off I could accept as an engineer.

90% of the inefficiency of web applications is down to the application stack, not the protocols. Replacing the hundreds of KiB of uncompressed text we send down with compressed abstract or native virtual machine instructions, for example, would be a bigger win.

Boy, it was dumb of them to put stream in the name (Stream Control Transmission Protocol) if it wasn't capable of acting in a streaming manner.

Oh wait, SCTP can act in an ordered-with-congestion-control mode (aka stream-oriented), and the userland interface to it (the most basic form of which is just plain old Berkeley sockets) does in fact implement packet assembly (of course, no matter what, if you want packets bigger than the MTU something's gonna have to disassemble and reassemble them on some level of the stack anyways).

Not to say that SCTP is a practical solution given the glacial pace of acceptance of any new network protocol at its level, but let's not start spreading FUD about its capabilities.

Yes, network protocols like IPv6 have a glacial deployment speed, because all the network equipment has to support it.

But it isn't so for transport protocols like SCTP. Only the endpoints using it need to support it. So a transport protocol that provides a real benefit could be deployed relatively quickly.

I am not trying to extol any virtues or negatives of SCTP, just to comment that for HTTP/2 to have multiplexing over a single connection without the head-of-line blocking problem, they have to implement messages also. Seems wasteful.
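
Roughly what "implementing messages" on top of a byte stream means, as a minimal sketch. The 6-byte header here (4-byte stream id, 2-byte payload length) is a made-up layout for illustration and is not the HTTP/2 frame format.

```python
import struct
from collections import defaultdict

# Hypothetical frame header: 4-byte stream id, 2-byte payload length, network byte order.
HEADER = struct.Struct("!IH")


def encode_frame(stream_id, payload):
    """Prefix a payload with its stream id and length so frames can be interleaved."""
    return HEADER.pack(stream_id, len(payload)) + payload


def demux(data):
    """Walk a buffer of complete frames and rebuild each stream's bytes."""
    streams = defaultdict(bytes)
    offset = 0
    while offset < len(data):
        stream_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        streams[stream_id] += data[offset:offset + length]
        offset += length
    return dict(streams)
```

For example, `demux(encode_frame(1, b"he") + encode_frame(3, b"css") + encode_frame(1, b"llo"))` yields `{1: b"hello", 3: b"css"}`. The sketch assumes the buffer holds only whole frames; a real implementation would also have to handle partial reads.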

    TCP != HTTP
TCP is a transport layer protocol (OSI Layer 4). HTTP is an application layer protocol (OSI Layer 7).

https://en.m.wikipedia.org/wiki/OSI_model

That's my point.

HTTP/2 is implementing TCP's responsibilities. Again. Badly.

The same criticism with equivalent merit could be leveled against TCP because it lives as far from the Physical Layer (OSI Layer 1) as HTTP lives from the transport layer, e.g. "TCP is implementing Ethernet's responsibilities."

But of course, though we rarely worry about Token Ring these days, we do run TCP over IEEE 802.11 all the time.

Likewise, HTTP is run over other Transport Layer protocols even if it is less common, e.g. UPnP uses HTTP over the UDP transport layer protocol. http://en.wikipedia.org/wiki/Universal_Plug_and_Play#Protoco...

OSI's higher levels are abstractions. As is the case with all useful abstractions, they serve to implement the functionality of lower levels without requiring attention to their actual implementation. Not having to manage TCP allows a lot of useful JavaScript to be easily written.

That's a disingenuous and self contradictory description of how the OSI stack works.

There are upwards guarantees at each layer that the stack makes. All implementations within the layer must be equal to the next layer even if one of the implementations provides capability of higher layers. Nothing is said however about adding further guarantees in layers 8, 9, 10, 11, 12...and so forth because they have already been made.

I suppose I shouldn't use parity bits on serial connections then?

"Not having to manage TCP allows a lot of useful JavaScript to be easily written"

That's absurd. It makes no difference.

As for UPnP, which I know well having written an entire UPnP stack, it's a broadcast messaging layer, not a connection based protocol. All the HTTP messages stay within the size of a UDP datagram and it is expected to be wholly unreliable. Even though it's ugly, it's hardly a comparison.

I get the feeling a lot of people here are web developers with little experience of protocol stacks and not system programmers!

> I suppose I shouldn't use parity bits on serial connections then?

If you're writing at the level of serial connections and parity, by all means pay attention to those details. If you're writing at higher level, consider abstracting away such details in an interface, library or module.

I miss Wildcat BBS as much as the next person: by which I mean, not very much. HN is full of really fucking smart people, not the idiots implied by your comment.

Huh? I think you missed something.

This is nothing to do with BBS's or code abstractions. On the former, there is no OSI stack; it's terminals down serial connections. On the latter, it's datagrams or sockets. It's about the guarantees that the link layer makes or doesn't. Parity doesn't pass up the layers because the guarantees are made further up (TCP).

You can still run token ring, serial, thick ether, thin ether, paper aeroplanes thrown between buildings. It doesn't matter above the data link layer.

Yes, there are really fucking smart people here, as you put it, but it appears there is a normal distribution of people as well.

If you call opening a new secure stream in one RTT rather than N RTTs "badly".
Where's the research that says that the connection overhead is destroying humanity?

Back when I had a 14.4k SLIP dialup and an RTT of 200ms+, connection overhead and TCP channel overhead were a major drag on throughput, but it's not like that now. I'd be surprised if there was a tangible difference to the end user.

> Where's the research that says that the connection overhead is destroying humanity?

It's destroying big business who push lots and lots of resources to the browser:

- From an admin POV, you have to shard your domain => more work, more maintenance.

- From a browser POV, you have to open multiple TCP connections => you take slow start and TLS handshake in your face for each connection + the connections have to fight each other because the OS wants to be fair among TCP connections

- From a web admin POV, you want to inline your content to reduce round trips => you have more work to do on your resources

SPDY is certainly not necessary for everyone (it mostly benefits those who push lots of different resources), that's true. We're talking about businesses who lose a month's worth of revenue if the latency to their site explodes from 50 ms to 500 ms.

But it still is interesting because the actual usage _on top of HTTP_ doesn't change: you still have your websockets or your Server-Sent events, you still have your keepalive, you can do a simple-stupid "one HTTP call per resource" and it will be handled efficiently, sometimes SPDY will work underneath to push content so that the next HTTP call will actually hit the cache without you knowing about it... all at the cost of changing (or updating) your library. Because you certainly don't write HTTP text directly to your TCP socket.

The interesting point will be for those library developers. The added complexity will certainly make it harder, but on the other hand the binary format and strict rules will make it easier to parse the messages... I'd like to see where it goes here.

But not quite. HTTP/2 is like a thread and TCP is like a process. The priority of a process can be raised or lowered, affecting all threads in it.
The issue is that HTTP/2 is basically working around deficiencies in TCP, and doing it badly, because it appears to be easier to get buy-in for that than for fixing TCP or deploying alternatives.