| HN Mirror



	by dcsommer 4773 days ago
	Pipelining falls short of SPDY in several respects. The biggest problem is that it suffers from head of line blocking. One slow request or response prevents others from making progress.

2 comments

osth 4773 days ago

I trust in theory this is true, but I've never personally observed this in practice.

I guess SPDY fans' marketing of this "feature" would be more convincing if I could see a demonstration.

I just don't see any noticeable delays when using pipelining.

What strikes me as peculiar about the interest in SPDY is that I never saw any interest in pipelining before SPDY. And I really doubt it was because of potential head of line blocking or lack of header compression. I think users just were not clued in about pipelining.

The speed up between not using pipelining and using it is, IME, enormous. 1 connection for 100 files versus 100 connections for 100 files. It is a huge efficiency gain.

Yet most users have never even heard of HTTP pipelining, or never tried it. If they really wanted such a big speed up, why wouldn't they use pipelining, or at least try it? Why wouldn't they demand that browsers implement it and turn it on by default?

Users are being encouraged to jump right into SPDY, a very recent and relatively untested internal project of one company (e.g. see the CRIME incident). Most users, if not all, have never previously experimented with even basic pipelining, which has been around since the 1999 HTTP/1.1 spec and is supported via keep-alives in almost all web servers.

Noticeable speed gains would be seen if www pages were not so burdened with links to resources on external hosts. That's what's really slowing things down, as browsers make dozens of connections just to load a single page with little content. The speed gains from cutting out all that third party host cruft would make any speed gains from avoiding theoretical potential head of line blocking during pipelining seem minuscule and hardly worth all the effort.

If you want to see how much pipelining speeds up getting many files from the same host, you do not need SPDY to do that. Web servers already have the support you need to do HTTP/1.1 pipelining. (Though on rare occasions site admins have keep-alives disabled, like HN for example. In effect these admins are saying, "Sorry, no pipelining for you.")
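For anyone who wants to try this by hand, here is a minimal sketch of what a pipelined HTTP/1.1 request stream could look like on the wire. The function name, host, and paths are placeholders made up for illustration, and it assumes plain HTTP on port 80 and a server that keeps the connection alive between requests.

```python
def build_pipelined_requests(host, paths):
    """Return raw bytes for several GET requests written back to back.

    Hypothetical helper: all requests are sent on one connection without
    waiting for any response, which is what HTTP/1.1 pipelining means.
    """
    requests = []
    for i, path in enumerate(paths):
        is_last = i == len(paths) - 1
        lines = [
            f"GET {path} HTTP/1.1",
            f"Host: {host}",
            # Ask the server to close after the final response so the
            # reader sees end-of-file instead of hanging on an open socket.
            "Connection: close" if is_last else "Connection: keep-alive",
            "",  # blank line terminates the header block
            "",
        ]
        requests.append("\r\n".join(lines))
    return "".join(requests).encode("ascii")


# Placeholder host and paths, not taken from any real site.
payload = build_pipelined_requests(
    "www.example.com", ["/a.html", "/b.html", "/c.html"]
)
print(payload.decode("ascii"))
```

Saved to a file, the output could be fed to something like `nc www.example.com 80 < requests.txt`. The responses should come back in request order on the same connection, assuming the server honors keep-alive.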


akalin 4773 days ago

HTTP pipelining is turned off by default in most browsers due to concerns with buggy proxies and servers (see https://bugzilla.mozilla.org/show_bug.cgi?id=264354 ). It may work for you and the particular set of servers you visit, but I suspect browser developers would rather have a browser that by default works with the widest possible range of configurations.

Unfortunately, it being turned off by default in most browsers means that most people won't see the benefits from it. Hopefully, the upcoming HTTP/2 standard will fare better (latest draft: https://tools.ietf.org/html/draft-unicorn-httpbis-http2-01 ).

Note that HTTP/2 will be based on SPDY (in particular, SPDY/4 with the new header compressor). Hopefully, when the standard is finalized and we have multiple strong implementations, that will allay the concerns you seem to have with SPDY today.

(Disclaimer: I work on SPDY / HTTP/2 for Chromium.)


osth 4773 days ago

Yes, I understand there are buggy servers and proxies... and I use a browser that has settings to accommodate them. However... I do not know about HTTP bugs that affect *pipelining*. And... in addition, for pipelining, I do not use a browser to do the initial retrieval. I use something like netcat to fetch and then I view the results with a browser.

Can you give me a list of buggy servers where my HTTP/1.1 pipelining will not work as desired? I've been doing pipelining for 10 years (that's quite a few servers I've tried) with no problems.

The arguments made by SPDY fans (e.g. Google employees) all seem plausible. But I wonder why they are never supported by evidence? IOW, please show me, don't just tell me. SPDY seems to solve "problems" I'm not having. Where can I see these HTTP/1.1 pipelining problems (not just problems with browsers like Firefox or Chrome) in action? I'd love to try some of the buggy servers you allude to and see if they slow down pipelining with netcat.


akalin 4772 days ago

I didn't have to look hard to find bug reports for pipelining. An example is https://bugs.launchpad.net/ubuntu/+source/apt/+bug/948461 for Amazon's S3. I'd be interested if the problem is still reproducible now. Also, one of the comments mentions Squid 2.0.2 as being buggy.

Also, see https://insouciant.org/tech/status-of-http-pipelining-in-chr... for a link to Firefox's blacklist of buggy servers (and a good discussion of pipelining in Chromium).

Most of the improvements in SPDY are latency improvements, so if you're downloading sites with netcat and then viewing them in a browser, I'm pretty sure the overhead of that would dwarf anything SPDY would save. That having been said, there's ample evidence of SPDY improving things. From http://bitsup.blogspot.com/2012/11/a-brief-note-on-pipelines... :

"Also see telemetry for TRANSACTION_WAIT_TIME_HTTP and TRANSACTION_WAIT_TIME_HTTP_PIPELINES - you'll see that pipelines do marginally reduce queuing time, but not by a heck of a lot in practice. (~65% of transactions are sent within 50ms using straight HTTP, ~75% with pipelining enabled).... Check out TRANSACTION_WAIT_TIME_SPDY and you'll see that 93% of all transactions wait less than 1ms in the queue!"


osth 4771 days ago

Thanks for the reading material.

You omitted the sentence before your excerpt where Mr. McManus suggests we move to a multiplexed pipelined protocol for HTTP.

I'll go further. I say we need a lower level, large framed, multiplexed protocol, carried over UDP, that can accommodate HTTP, SMTP, etc. Why restrict multiplexing to HTTP and "web browsers"? Why are we funnelling everything through a web browser ("HTTP is the new waist") and looking to the web browser as the key to all evolution? It seems obvious to me that what we all want is end to end peer to peer connectivity. Although the user cannot articulate that, it's clear they expect to have "stable connections". This end to end connectivity was the original state of the internet. Before "firewalls". Client-server is only so useful. It seems to me we want a "local" copy of the data sources that we need to access. We want data to be "synced" across locations. A poor substitute for such "local copies" has been moving data to network facilities located at the edge, shortening the distance to the user.

But, back to reality, in the case of http servers, common sense tells me that opening myriad connections to (often busy) web servers to retrieve myriad resources is more prone to potential delays or other problems (and such delays could be due to any number of reasons) than opening a single connection to retrieve said myriad resources. Moreover, are his observations in the context of one browser?

I guess when you work on a browser development team, you might get a sort of tunnel vision, where the browser becomes the center of the universe.

If you dream of multiplexing over stable connections, then you should dream bigger than the web browser. IMO.

I'm aware of a bug in some PHP databases with keep alive after POST. I mainly use pipelining for document retrieval (versus document submission) so I am not a good judge of this. What I'm curious about is where keep alives after POST would be desirable. You alluded to that usage scenario (a series of GETs after a large POST).


akalin 4771 days ago

Re. Patrick's sentence, you're right, but as I mentioned above, SPDY/4 will become HTTP/2 (we're working through the standardization process). So I think most of the major players are on board with "fixing" HTTP pipelining by using SPDY-style multiplexing.

Re. thinking bigger, you might want to read up on QUIC, which was announced recently: http://en.wikipedia.org/wiki/QUIC . Based on that, I would contend that at least we on the Chromium team don't have tunnel vision. :)

Re. your question, Patrick's data is from Firefox only I believe. You're right that it's not surprising his stats show that SPDY helps over HTTP without pipelining. But the more interesting thing is that HTTP with pipelining still doesn't help that much over HTTP without pipelining (on average) and SPDY still beats it by orders of magnitude. I'd have to dig, but I'm pretty sure there are similar stats on the Chromium side.


grey-area 4773 days ago

Osth, you appear to have been hellbanned. You should ask for this to be fixed. I see nothing meriting it in your comments.


thwarted 4773 days ago

This isn't a problem when the primary request is dynamic and served from one server/domain/connection and the remaining requests are for static assets stored on and served from another server/domain/connection.


dcsommer 4773 days ago

Consider if the client makes a large POST followed by a few GETs. If the client has little upload bandwidth, the GETs will be delayed until the POST completes. With SPDY, they all can proceed concurrently. Similarly, if the client makes 5 GET requests, with the first being a heavy/slow/expensive resource for the server, the cheap resources can't be delivered until the slow resource finally is computed and returned.
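The second case (one slow GET ahead of four cheap ones) can be sketched with a toy model. This is a simplification under loud assumptions: the server works on all requests at once, transfer time is ignored, and the timings are invented numbers, not measurements. The function names are made up for illustration.

```python
from itertools import accumulate


def pipelined_delivery(ready_times):
    """Delivery times when responses must go out in request order.

    Response i cannot be sent before responses 0..i-1, so its delivery
    time is the running maximum of the ready times seen so far.
    """
    return list(accumulate(ready_times, max))


def multiplexed_delivery(ready_times):
    """Delivery times when each response goes out as soon as it is ready."""
    return list(ready_times)


# Invented example: one expensive resource followed by four cheap ones,
# expressed as seconds until the server has each response ready.
ready = [2.0, 0.1, 0.1, 0.1, 0.1]

print("pipelined:  ", pipelined_delivery(ready))
print("multiplexed:", multiplexed_delivery(ready))
```

In this model the four cheap responses are stuck behind the slow one with in-order delivery, while with multiplexing they arrive at 0.1 seconds each. If the slow request comes last, the two cases give the same result, which may be why the blocking is not always noticeable.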


thwarted 4772 days ago

Yes, that's the reason that SPDY exists, but my point is that that's actually a rare implementation in the real world. As was said elsewhere in this thread, and as I was saying, the most likely implementation is that one request goes to, say, www.example.com, which serves a single HTML file, and the remaining requests for resources that the HTML references go to outsourced-cdn.example.com. So it's actually more important to have pipelining and SPDY support on outsourced-cdn.example.com than it is on www.example.com. That is, chances are you don't need to worry about it; that's what you pay outsourced-cdn for. There is less of a need for multiple simultaneous requests when you have good client-side caching of resources too. The usefulness of multiplexed requests is negated if you only serve one request.

The above has been the exact case at a number of companies I've worked at.

It becomes more important for sites and companies like Google or Facebook that serve all their own traffic.
