Hacker News
by geforce 3394 days ago
Was talking with colleagues about cloudbleed and S3 problems yesterday.

I don't feel like many people are actually concerned about the implications of having an internet that isn't an internet anymore, but merely a handful of big companies hosting everyone.

Or maybe it's me who doesn't understand.

6 comments

20 years ago, sites got "slashdotted" left, right and centre. Cloudflare and friends put an end to that, forever. Sites went offline for days if not weeks because of hardware failure -- some never came back because they didn't back up correctly. S3 went offline for five hours, but didn't lose a single bit of data.

You're looking at the past through rose-tinted glasses. We learn all the time, and we will probably also learn to build some resilience into our systems against these issues. But as someone who's been through a thing or two (see above) on the Internet of yesterday, I like the one we have today.

Cloudflare had nothing to do with ending slashdotting. This stopped being a problem years before they existed.

Slashdotting was mostly a problem caused by Apache's incredibly inefficient design. It consumed huge amounts of memory per connection at a time when most of us had very slow connections. A link from Slashdot was, in effect, a Slowloris attack on your server.

The big change was moving from a fork/thread-based webserver (Apache) to an event-based webserver (nginx), which was made even more efficient by kernel features like epoll.

Sorry to be blunt, but this is just incorrect.

The problem with "Slashdotting" was the number of concurrent connections. Heck, a fair portion of the time it was the database that keeled over first, not Apache.

Slowloris attacks send purposefully incomplete requests and hold them open with additional headers. Even with dial-up modems, connections were never slow enough for this to be a problem with actual requests, which are lightweight.

Responses are heavy and can tie up slow connections, especially if they have to go get stuff out of the database. But in that case it's no longer a Slowloris type attack. It's just too many concurrent connections.

The Slashdot effect was solved with static HTML caching, simply because caches are faster and don't touch the DB. Cloudflare is a simple, free example of such a cache, although certainly not the only one.

Bluntness is okay, but aren't you wrong about me being wrong here?

I didn't say it was a Slowloris attack. I said Slashdotting was "in effect" the same thing. Which it is: both problems come down to exhausting limited concurrency.

> The problem with "Slashdotting" was the number of concurrent connections.

Exactly the same problem a Slowloris attack exploits.

> Responses are heavy and can tie up slow connections...

Yes, responses tie up the limited number of available httpd processes.

The problem was that Apache couldn't even serve static files to many clients because of its heavyweight httpd processes and the fact that clients were so slow.

If your web server can only handle 200 concurrent connections, and you want to serve a 500 KB screenshot of your 1337 Linux desktop to clients that download at 3.5 kbyte/s, you can handle like ~1.4 req/s. Doesn't take much to get Slashdotted.
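The arithmetic behind that ~1.4 req/s figure can be checked with a short sketch. The function and parameter names are made up for illustration; the numbers are the ones from the comment, and the model assumes each connection holds one worker for the whole download.

```python
def max_requests_per_second(max_connections: int,
                            response_kbytes: float,
                            client_kbytes_per_s: float) -> float:
    """Upper bound on throughput when every connection ties up one
    worker for the full duration of the download (an assumed model)."""
    seconds_per_response = response_kbytes / client_kbytes_per_s
    return max_connections / seconds_per_response


# Figures from the comment: 200 workers, 500 KB file, 3.5 kbyte/s clients.
rate = max_requests_per_second(200, 500, 3.5)
print(f"{rate:.1f} req/s")  # prints "1.4 req/s"
```

Each download takes about 143 seconds, so 200 slots turn over roughly 1.4 times per second. Raising the connection limit tenfold raises the ceiling tenfold in this model.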

Whereas, event-based webservers could handle at least 10x more connections on the same hardware even before epoll existed.

I had this problem in 1998 and fixed it with select/poll based servers, and then eventually other epoll-based servers before nginx existed.
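For readers who haven't seen the pattern, here is a minimal sketch of a select/poll-style server in Python, using the standard `selectors` module. It is not the code described above, just an illustration of the idea: one thread watches every socket and only touches the ones that are ready, so a slow client costs a dictionary entry instead of a whole process. The `serve` function, the `stop` parameter and the fixed response are assumptions made for the sketch, and it treats any received bytes as a complete request.

```python
import selectors
import socket
import threading

RESPONSE = (b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 3\r\n\r\nok\n")


def serve(listener: socket.socket, stop: threading.Event) -> None:
    """Serve all clients from a single thread using readiness events."""
    sel = selectors.DefaultSelector()  # epoll, kqueue or select, whichever exists
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    pending = {}  # client socket -> bytes still to be sent

    while not stop.is_set():
        for key, events in sel.select(timeout=0.1):
            sock = key.fileobj
            if sock is listener:
                try:
                    conn, _ = listener.accept()
                except BlockingIOError:
                    continue
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            elif events & selectors.EVENT_READ:
                try:
                    data = sock.recv(4096)
                except ConnectionError:
                    data = b""
                if data:
                    # Simplification: any bytes count as a full request.
                    pending[sock] = RESPONSE
                    sel.modify(sock, selectors.EVENT_WRITE)
                else:
                    sel.unregister(sock)
                    sock.close()
            elif events & selectors.EVENT_WRITE:
                try:
                    sent = sock.send(pending[sock])
                    pending[sock] = pending[sock][sent:]
                except ConnectionError:
                    pending[sock] = b""
                if not pending[sock]:
                    # Response fully written (or client gone): free the slot.
                    del pending[sock]
                    sel.unregister(sock)
                    sock.close()
    sel.close()
```

To try it, bind and listen on a socket, then run `serve(listener, stop)` in a thread with a `threading.Event` as `stop`. A slow reader simply stays in `pending` until its socket is writable again, without blocking anyone else.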

If we go back to 1998, maybe network throughput was the limiting factor that drove up concurrent connections and killed servers. But I also don't think we can say nginx solved that, since nginx didn't start seeing wide usage until about 10 years later.

I guess I wrote what I did because the comparison to Slowloris seemed to over-emphasize the importance of handling high numbers of concurrent connections, since that is the only mitigation for Slowloris.

But, for a flood of real traffic, concurrent connections and throughput are related. The faster your web server can serve responses, the fewer concurrent connections it will need to handle. And as the percentage of dynamic DB-backed sites has increased over time, so has the value of caching. Basic page caching can speed up a WordPress blog by hundreds of times for unauth'd users, for example. For most little sites, implementing caching will get them more than installing nginx.
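As a rough illustration of why page caching helps so much, here is a minimal in-memory sketch. `PageCache` and `render_from_db` are hypothetical names, not part of WordPress or any real plugin, and the sketch assumes every visitor is anonymous so that one rendered page can be shared by all of them.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Tiny full-page cache with a time-to-live per entry."""

    def __init__(self, render: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock
        self._pages: Dict[str, Tuple[float, str]] = {}

    def get(self, path: str) -> str:
        now = self._clock()
        hit = self._pages.get(path)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # fast path: no backend work at all
        html = self._render(path)  # slow path: templates plus DB queries
        self._pages[path] = (now, html)
        return html


calls = []


def render_from_db(path: str) -> str:
    # Stand-in for an expensive, database-backed render (hypothetical).
    calls.append(path)
    return f"<html><body>Post at {path}</body></html>"


cache = PageCache(render_from_db, ttl_seconds=60.0)
for _ in range(1000):
    cache.get("/hello-world")
print(len(calls))  # prints 1: the backend rendered the page once
```

A thousand hits cost one render. Logged-in users would need to bypass a cache like this, which is why the speedup applies to unauthenticated traffic.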

And really, what good are valid concurrent connections if the throughput isn't there? For most users, a site that waits 5 minutes on a blank page is no better than a server that's down.

One of the main causes of that had a one-line fix in the Apache config:

 	KeepAlive Off
Yup!
> This stopped being a problem years before they existed.

I still see this issue happening daily here on HN.

And every time it happens 10 people jump to comment about how the site is written poorly and could probably be an entirely static site.
This sounds plausible, but is there any way to validate this with some kind of empirical evidence?
Some website being Slashdotted is not in the same class of issues as users' (possibly encrypted) page content being spread all over the web. Also, the notion that you either need to use some super-centralized platform or be susceptible to the Slashdot effect or losing backups is a false dichotomy.
> Sites went offline for days if not weeks because of hardware failure -- some never came back because they didn't back up correctly. S3 went offline for five hours, but didn't lose a single bit of data.

S3 going offline for five hours probably caused orders of magnitude more damage. The reason is simple: the stakes are higher today than they were 20 years ago. Back then the overwhelming majority of the business in most companies was conducted on paper and in person. Losing a website for a few days, or even a few weeks, wasn't a big deal. Today though? There are so many companies that host business-critical operations and infrastructure in the cloud that it's hard to fathom how they are coping with being taken offline for several hours in the middle of the week.

Wait. How does Cloudflare prevent a site from going offline for weeks because of hardware failure?
S3 is the reference here, not Cloudflare.
Nonetheless, Cloudflare caches sites and may serve a cached version of your site when it's down, assuming it's "static" enough to be readable from cache.
I'm sorry, but I disagree that we have an internet that isn't an internet.

Even though two particular problems affected a great many sites, they didn't affect every site or even a majority of sites. That's because we don't actually have a "handful of companies hosting everyone."

Maybe I'm not a typical user, but the only site that was affected by S3 that I use was HN. The only three sites affected by Cloudbleed that I use were HN, Discord, and another smaller site elsewhere. The rest of the internet worked perfectly fine for me.

I'm pretty sure 90% of the sites typical users visit were unaffected also. I mean, Facebook didn’t go down, did it? People could still search on The Google. I don't know of anyone whose job was affected by this aside from people who run websites. Neither of these were as bad as the Dyn outage, which wasn't the apocalypse either.

I agree that over-centralization of the Internet is something to be concerned about, and it's definitely something you should consider before choosing services like Cloudflare and S3. But I don't think it's so bad today that the Internet isn't the Internet anymore.

It's almost as if self-reinforcing feedback loops are all over society these days. Being at the top makes it that much easier to stay at the top and crush everything else. Internet companies are just one small slice of this trend.
Antitrust is all bark with no bite these days. Most industries having 3-4 "competitors" that don't actually compete is the new norm. I wouldn't be amazed if a link is shown between this situation and the increasing wealth inequality.
Antitrust laws were remarkably different in the recent past. Here is an excerpt from http://washingtonmonthly.com/magazine/novdec-2015/bloom-and-....

"To get a flavor of how thoroughly the federal government managed competition throughout the economy in the 1960s, consider the case of Brown Shoe Co., Inc. v. United States, in which the Supreme Court blocked a merger that would have given a single distributor a mere 2 percent share of the national shoe market."

Beyond simple anti-trust, in many cases, we need to reform legal structures so that big players can't use them to block out startups.

It's practically impossible to defend yourself well in a lawsuit against any respectably-sized company without a few million lying around, first of all. That affects everything, and big companies use that fact to bully upstarts and other small innovators into shutting down all the time.

My own business was effectively shuttered (had to stop selling our primary product) by a C&D from a Fortune 100. It would've been 5-10 years and ~$5 million to see that case all the way through, and under current precedent, it's very likely I would have lost.

Aside from that, industries frequently get laws and rules put in place with ostensibly-reasonable rationale, while the actual intent is to make it virtually impossible for disruptive competitors to enter the marketplace.

The CFAA is the piece of legislation that primarily enshrines entrenched players in the online space. We also need to reform copyright law and clarify some matters regarding the applicability of EULAs, especially with regard to clickwrap and browsewrap.

Once that's done, the flood of innovators that have been held back by big companies dispatching their law firms will finally be able to contribute, and the internet's competitive landscape will truly be back in the hands of the users. It will shift from "Who has my data? I have to go with them" to "I can use any interface I want to access that data", effectively resolving the chicken-and-egg effect that imperils any potential competing social network (not even Google could compete with Facebook on this!).

> It's practically impossible to defend yourself well in a lawsuit against any respectably-sized company without a few million lying around

Large corporations keeping a bunch of lawyers on their payroll are the equivalent of large nation states stockpiling nuclear weapons.

That's because you're viewing this situation entirely from within your prior beliefs about large corporations, which somehow have convinced you that only 3 or 4 hosting companies exist and everyone must use them.

Exactly how do you figure this? There are lots of alternatives to S3:

1. All the cloud stuff, which is the newfangled hotness (3 to 4 companies)

2. Non-cloud hosting providers (hundreds of them, from shared hosting to VPSs and above, and usually cheaper if you don't have high traffic)

3. Hosting your own stuff on your own hardware, which is the cheapest if you really need it, and can be done in a data center just about anywhere in the world with good internet.

So, what planet is this that you live on where 3 to 4 companies handle all hosting and we have to use them?

Man, back in my day the internet was even LESS of an internet than it is now. Very few companies had websites, and very few hobbyists could afford to run a website (there were few hosting companies, and it cost $100 to register a domain name with the only registrar around, Network Solutions). Running an internet presence was EXPENSIVE.

Sure, there has been consolidation in the industry, but it is much easier to exist on the internet now than ever before.

> I don't feel like many people are actually concerned

Judging by how frequently I read this sort of comment on HN, there are definitely many people who are concerned here.

What are you proposing? We should all serve off of our DSL lines?
Colocation, dedicated servers, even VPS. There's plenty of middle ground between serving off your home server and AWS, Cloudflare et al.
And to be blunt, I'd even claim that the middle ground is superior for most people to AWS, Cloudflare et al.

I have limited experience with AWS, but they always seem to be having problems that ruin everything and that they don't acknowledge. That's what I get from people at the office who have to use it to support a particular customer. Nothing we do ever seems to run properly on it, even though our software is fine everywhere else. We're trying to push that customer off to using their own hardware, and they agreed with us that it's necessary but for different reasons.

Cloudflare, I don't even understand why anyone would use it. I understand the benefits it claims to provide, but you can get those without having an external MITM proxy that can spew information all over the Internet with one bug.

Ah! Maybe that would encourage people not to make 15 MB pages doing dozens of requests!
I wish. Too bad there isn't any meaningful competition in the ISP space for most people.
that would be cool