Hacker News
by zelphirkalt 1305 days ago
By now Cloudflare is more of an obstacle to the free web than it is helping. A centralized entity, whose scripts from randomly named subdomains you must allow to run on your machine, or be stuck at their obnoxious "checking your browser" page endlessly reloading, because some web dev decided to put their website behind Cloudflare. Cloudflare is one of the most prominent reasons for me to simply close the browser tab and leave the site.
8 comments

On a theoretical level, a service like Cloudflare is the most terrifying entity on the Internet I'm aware of. They've accumulated an insane degree of insight into the traffic flow of the web (since their entire service is essentially acting as an HTTPS middleman), and their business is offering protection against bot spam that could ruin most websites. Even if they aren't operating the bots themselves, they're essentially displacing the bot problem to the unprotected websites. The overall shape of this operation is something the Cosa Nostra could have cooked up in the 1970s.

However, having been on both sides of this, both operating a bot for my search engine and operating a web service that is aggressively targeted by bots, I find they're not actually bad to deal with.

The big unanswered question is how they'll manage to stay good given the obvious incentive of abusing this setup. Maybe this CEO has a moral backbone, but will the next, and when they're acquired by the Meta-Amazon-Alphabet group in 15 years, will they still stick to these principles?

Internet security has, in my experience, always been about "being just hard enough a target that the bad actors decide to go torment somebody else."

It was true twenty years ago too; the only difference I can see between then and now is that you can outsource that task for a (relatively) small amount of money if you want to.

Then again, the last time I dealt with a site under DDoS, something in their stack was leaking the underlying IP (never did figure out what) but it turned out that "finding a provider who'd sell them a decent sized server and charge them for the bandwidth" was perfectly economical for their use case because their haters' firepower was insufficient compared to their revenue.

(I'd love to be less vague here but I'm sure readers can see the obvious professional ethics issues with doing so)

I'm surprised you're handling incoming requests from everybody. We only process the CloudFlare ones and drop the rest.
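For readers who haven't set this up, here is a minimal sketch of the "only process requests from the proxy, drop the rest" check. The function name `is_from_proxy` is hypothetical, and the ranges are placeholder documentation blocks rather than Cloudflare's actual published ranges, which you would have to substitute yourself.

```python
import ipaddress

# Placeholder ranges taken from the reserved documentation blocks.
# These are NOT Cloudflare's ranges; a real deployment would load the
# provider's published list instead.
PROXY_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_from_proxy(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any allowlisted range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable peer address: treat it as untrusted.
        return False
    # Membership is False when the address and network differ in IP version.
    return any(addr in net for net in PROXY_RANGES)


if __name__ == "__main__":
    for peer in ("192.0.2.17", "203.0.113.5", "2001:db8::1"):
        print(peer, "accept" if is_from_proxy(peer) else "drop")
```

In practice this kind of filter usually lives in the firewall rather than in application code, so that dropped connections never reach the web server at all. It also does nothing about traffic that simply saturates the link.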
You can fill the pipes to the server(s) you're targeting, it doesn't have to be application layer.
These days, Cloudflare lets you serve your origin via a tunnel from a host that doesn't even have a public IP.

And if you run that in a cloud, the NAT isn't your problem -> your attacker will have to DoS that cloud as a whole.

That's an extremely smart approach that I sincerely doubt the site operators would have been capable of dealing with.

Part of the art of consultancy is, sometimes to my great annoyance, optimising for "within the customer's budget" and "within the customer's capacity to maintain it after I'm no longer involved" over "best possible solution."

Plus in this particular case I was working pro bono because (a) I quite liked the site in question continuing to exist (b) a Shadowcat alumnus asked me nicely (c) I take great pleasure in ruining a griefer's entire week. So lightest possible touch was strongly indicated.

The end result was not remotely clever, but it's been in production for a while now and has not to my knowledge caused financial or uptime issues, so I'm going to call it a win even if the inelegance of -how- I won continues to irritate me ;)

Right, "finding a provider whose reaction to an aggravating quantity of incoming packets was charging money rather than throttling the connection" was basically the load bearing part of the solution here.

Fortunately, while said quantity was indeed aggravating, it was low enough that the cost was financially and logistically less than trying to do something more elegant.

Sometimes brute force and ignorance is, in fact, the right answer, and I don't have to -like- that being true for it to be true.

>The big unanswered question is how they'll manage to stay good given the obvious incentive of abusing this setup.

Why do you think they're still "good"? CloudFlare has chosen to abandon sites that held free speech (abhorrent speech, but still free speech) while still protecting forums upon which credit cards and methamphetamine were listed for sale on the front page.

To me, that's not a sign of a "good" actor.

Free speech doesn't exist within the context of a privately held website.
Free speech is an ideal, not just an amendment.
but a private entity (person or corp) does not have any obligation to protect ideas they find abhorrent to be considered on the side of "good".
That's the point. They find free speech abhorrent but consider selling dangerous drugs to be acceptable.

So many people find them abhorrent because they represent different values.

Legally no one has any obligations here.

Agreed, but it's a strange value system that says, "Dealing meth and stolen credit cards is okay, but having a web forum that makes fun of people is not."
I always figured that the main thing Cloudflare protected against was DDoS attacks, not bots (DDoS may be caused by bots, but with significantly different outcomes -- a single bot in and of itself won't take down a website)

RE bots: TikTok has incredible bot protection that comes from engineering (webmssdk) instead of network-based filtering. I'm not even sure if they use Cloudflare.

Cloudflare doesn't even really protect against DDoS. Sometimes taking your website off Cloudflare is the only way to stop a DDoS attack. That's because you can't stop something like a layer 4 DDoS attack by blocking the IPs in raw prerouting iptables, because if you did that then you'd be blocking Cloudflare's IPs. The only option Cloudflare really provides you is pressing a panic button that forces everyone who visits your site to view a captcha, when it's really so trivial to just run the iptables commands using a token bucket algorithm. I know because I run a website on a 2 vCPU VM that gets DDoS'd all the time. I've had to block over nine thousand malicious IPs so far. I tried using Cloudflare in the past for their protection services, but it (1) left me defenseless against bad visitors and (2) made good visitors angry at me for the captchas.
How did the attackers get your origin IP to begin with? I thought Cloudflare was supposed to shield it at the DNS level, and in theory your origin should be dropping all connections not coming from an authenticated Cloudflare proxy?
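For anyone unfamiliar with the token bucket idea mentioned above, here is a minimal sketch of the algorithm itself, written in Python rather than as iptables rules. The class name, the per-IP `allow_packet` helper, and the rate and burst values are all hypothetical and are not taken from the commenter's setup.

```python
import time


class TokenBucket:
    """Allow up to `burst` events at once, refilled at `rate` tokens per second."""

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per source address (illustrative limits only).
_buckets = {}


def allow_packet(src_ip, now=None, rate=5, burst=10):
    bucket = _buckets.get(src_ip)
    if bucket is None:
        bucket = _buckets[src_ip] = TokenBucket(rate, burst, now=now)
    return bucket.allow(now=now)
```

The `now` argument exists only so the behaviour can be checked with fixed timestamps; normal callers would leave it out. A source that stays under the rate never notices the limiter, while one that floods gets cut off after its burst allowance is spent.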
They weren't able to talk to my origin IP, because when I was using Cloudflare, I blocked at the firewall all IPs that weren't Cloudflare. The problem is that they would DDOS my server through Cloudflare. And because the traffic was being proxied, I couldn't block the attackers without blocking Cloudflare. Unless of course I wanted to fill out a form on their website 9,000 times. It's an awesome website by the way. I love their workers and r2 products. But Cloudflare honestly isn't that good at DDOS protection. These attacks were so bad that Cloudflare would start showing NGINX error pages before my web app even went down. Cloudflare should be paying me to protect them, rather than the other way around.
Do you have a support ticket # you can email me w/details (pat at cloudflare)?

We take every reported false negative as an opportunity to improve our DDoS mitigations, and these reports are very helpful.

As of a few weeks ago, you can now report FNs/FPs for Bot Mitigation directly in the dashboard, and we'll be expanding this pattern for use with DDoS Mitigation as well.

They do both. DDoS mitigation happens at the network level, while bot protection uses a combination of whitelists, blacklists, behavioral heuristics like mouse movements, login state, and captchas.
ALL big tech companies have the same setup. There is nothing unique with Cloudflare. People are just talking about Cloudflare cause it is accessible for free and they sell it as a service.
He has shown time and again that his backbone's strength depends on how loud the public noise is. Kiwifarms most recently. You can dislike them (Kiwifarms etc.) and there is a case for them to be taken offline imo, but it is the government's job.

Exactly what you do _not_ want protecting the neutral internet. They've done better being neutral than some might have, but that's in reality more insidious because clearly there are points they will bend on and those points will change over time and almost certainly continue to erode.

I never really understood Cloudflare's intent, because from the marketing material it seems that you get DDOS "protection", free TLS certs, everything in a monthly package, affordable, bla bla bla.

But from some basic calculations I get that R2, Workers and egress bandwidth beyond a few terabytes cost just as much as Oracle cloud / Alibaba.

But what I dislike the most is how little control you have over what's going on there. Like: If you haven't setup TLS on your webserver, why do they allow unencrypted traffic to flow between the server <-> Cloudflare and encrypt it to the end users and pretend that is secure?

Why can't they forward all my server's headers? Why <XYZ> ?????????

Read some horror stories on Hacker News and you'll quickly find out what their "unmetered bandwidth" really means. You get very little if any transparency about the pricing, which I would expect from tiny cloud companies, but this is supposed to be a major one!

I think the ability to put TLS in front of a non-TLS'd website comes down to a couple of things:

1. It's probably better than nothing.
2. It's a legacy thing.

A company like Cloudflare has to make a choice - how frequently do we break users who've set up their site in a way that is no longer in line with security best practices? It looks like the decision they've made is to break infrequently. Certainly the site I set up in 2014 when their free TLS was new still runs, and I haven't made changes.

I believe that you can set up strict TLS between Cloudflare and the end host if you choose, but it's up to you. I think in that instance, your 'little control you get' is actually more control, no?

And, if you look back even a few years, TLS was both uncommon and expensive. Cloudflare was a pioneer by offering free TLS certificates in I think 2014 (only 8 years ago!). Let's Encrypt started in 2015 and was niche for quite some time. I think even now you can find Linux distros preferring to ship their data over HTTP with GPG-keys recommended for the security. Of course in 2022 even simple sites should be TLS'd, but Cloudflare's existed for a while.

And, TLS to the client but plaintext from CDN to site is still better than cleartext the whole way, because it (generally) stops the ISP from snooping on its customers.

> I think even now you can find Linux distros preferring to ship their data over HTTP with GPG-keys recommended for the security.
This isn't really to solve the same problem though. The GPG key thing is so you can use mirrors for hosting that are distributed but still trust the package came from the real source. TLS termination of where the packages are retrieved is separate.
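To illustrate why the mirror's transport doesn't need to be trusted for integrity, here is a simplified sketch. It swaps the GPG signature check for a plain SHA-256 comparison against a digest assumed to come from the trusted source (roughly the role that signed metadata plays). The function name and the sample data are hypothetical.

```python
import hashlib
import hmac


def verify_download(data: bytes, trusted_sha256_hex: str) -> bool:
    """Check bytes fetched from an untrusted mirror against a trusted digest."""
    actual = hashlib.sha256(data).hexdigest()
    # Constant-time comparison; the expected digest is normalised to lowercase.
    return hmac.compare_digest(actual, trusted_sha256_hex.lower())


if __name__ == "__main__":
    package = b"example package contents"
    # Assumption: this digest was obtained from the real source via signed metadata.
    trusted = hashlib.sha256(package).hexdigest()
    print(verify_download(package, trusted))               # untouched copy
    print(verify_download(package + b"tamper", trusted))   # modified in transit
```

This covers integrity only. Anyone on the path can still see which packages are being fetched, which is the part TLS would address.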
Yes, the GPG piece provides that functionality nicely. However, it’s exceedingly common for the mirrors to not be provided over TLS for cost reasons. Netflix switched to serving video over TLS for no other reason than to promote the usage of TLS (after a lot of custom engineering (PKI on CPU, crypto on NIC, iirc?) to reduce the overheads of doing this).
A few TB/mo is quite enough for a lot of smaller companies, and DDoS protection is something that a smaller company can see as a pretty valuable thing. A CDN with thick worldwide presence does not hurt either. So using Cloudflare is a no-brainer for a smaller business, especially with the prices they offer. Not using Cloudflare means either buying separate DDoS protection (likely offered by your cloud provider), or risking an extortion attack.

Some competition exists, but it's both more expensive and less reliable and convenient.

The two actual whys you have posted are settings you can change in the Cloudflare config.
> But what I dislike the most is how little control you have over what's going on there. Like: If you haven't setup TLS on your webserver, why do they allow unencrypted traffic to flow between the server <-> Cloudflare and encrypt it to the end users and pretend that is secure?

I don’t get the issue here. The traffic between client and Cloudflare is secure. SSL is terminated at Cloudflare. You can choose to have end to end security if you want.

If you set up your own frontend that terminates SSL, but choose not to secure the traffic to your backend, the end client will still see the connection as secure.

Can't you use "Strict Origin" cert on Cloudflare? Here is a pic of my settings: https://i.imgur.com/aHQ1U1L.png

Sorry if I am missing something here. Cloudflare gives flexibility to their customers. That seems right.

Cloudflare enterprise is pretty transparent if you've gone through the sales process. They tell you exactly what the limits are. For the average person on the free plan, they are not obligated to provide details of where the limits are. That's no different from Backblaze's unlimited storage plan.

I agree that it is difficult to know exactly what you are paying for but they are very affordable.
> If you haven't setup TLS on your webserver, why do they allow unencrypted traffic to flow between the server <-> Cloudflare and encrypt it to the end users and pretend that is secure?

I Really Can't Think of Any Reason

I remember working on denial-of-service protection code for an embedded device.

One problem was that if the code was TOO aggressive in protecting from a denial of service attack, you could actually help an attack or be the culprit yourself by denying legitimate traffic.

I think this is what Cloudflare is doing. They are imprecise and they are denying legitimate traffic.

I don't think that ever happens. If anything they are too lenient. Our own alarms kick in way before Cloudflare's DDoS protection is activated.
Crawling websites behind Cloudflare can also be problematic if they (CF) decide that your bot doesn't fit their definition of OK. This is problematic for new search engine entrants and a multitude of other services, particularly given how many sites now live behind CF.

Years back their DNS service also stopped honouring ANY (ns_t_any) queries, apparently because of DDoS amplification.

I do tend to agree with you about centralisation, gatekeepers particularly.

You can't run around and crawl other people's sites. That time is long gone.
Not sure if you're being sarcastic!

In the end for any scraping they're just raising the barrier of entry. Automated browsers, residential proxies, captcha services just make it more involved for those determined to hit a URL successfully.

Not necessarily a bad thing, but the line, the grey area, and the definition of a 'legitimate' request all vary, and one entity as a middle-man deciding that is less than ideal.

Not sure where you're coming from, but let us go back 10-15 years, when there was an open market for commercial crawlers and IP ranges to be used for it. You sold shoes and scraped all your competitors, for instance. That era is over.

For legitimate interests today including search engines and services for price comparison that data is often provided for free.

There are design patterns used today that, among other things, provide incorrect prices to scrapers.

Scraping is illegal in most western countries btw.

>For legitimate interests today including search engines and services for price comparison that data is often provided for free.

Can you explain that further, given that search engines, new or existing, need to crawl websites and Cloudflare is a barrier to entry? You seem to contradict yourself.

>There are design patterns used today that, among other things, provide incorrect prices to scrapers.

If you say so, and hopefully they do it 100% correctly.

I still remember when they posted some articles about those pages wasting time

https://blog.cloudflare.com/introducing-cryptographic-attest...

while they are the main reason (in my browsing at least) the "verification pages" happen.

One thing I noticed was how Cloudflare branding used to be pretty prominent on those pages, and now is pretty small.

I think they probably realized that maybe they don't want to be known as the reason these pages are showing up everywhere and inconveniencing legitimate traffic.

I don’t know… the so-called free web is also a bot paradise, and like it or not Cloudflare is actually helping mitigate it to some degree. It comes with a cost but maybe it’s worth it?
Cloudflare has played a major part in making VPNs suck, by providing a service that actively blacklists VPN IPs and selling companies on integrating the VPN blocker into their services.

It's probably true that some VPNs are used for nefarious stuff, but it's also lame that Cloudflare is such an anti-privacy warrior.

The web basically relies on bots to exist, search engines wouldn't work without them, Archive.org uses them to archive the web, etc.

It would be interesting to know what percentage of bots are actually nefarious.

One of the reasons it became like this is that there is no protocol that would allow a host to ask an upstream provider to block traffic from another host (so that malicious traffic is blocked close to the originating network). If there were such a protocol, site owners could protect themselves from attacks, but without it you have to use Cloudflare unless you are Google scale, with channels wider than the attacker's.
What scripts from random subdomains are you referring to? I know that from CloudFront (Amazon's CDN), not Cloudflare. CF usually keeps everything on your domain.

The "checking your browser" isn't a default CF thing btw, that's up to the site owner and how paranoid they are (with or without reason). It's annoying me too, but we have sites on CF and practically nobody sees any checks when they access our sites.

Good to know that this page is due to the Cloudflare customer! I am only seeing the results of that paranoia in my daily browsing and it sucks. I recently had to banish GitLab to its own browser profile, because with my previous main profile settings, it simply wouldn't let me log in. I am treating it from now on as contagious, because of that "checking your browser" bs.

(I did write a support request message to Gitlab, but their support clearly sucks. What do I know what kind of subscription my employer has? I don't care! They are paying for me, so Gitlab should offer a modicum of support, if I cannot even log in on their shitty site any longer, because of their changes. But they stonewalled with something like: "We need to know your subscription level blablabla before we can continue the process." kinda automated e-mail. Well, duh! Check your friggin database for my subscription level. Oh but then you would actually have to work. Ah that's a problem of course. Better stonewall a paying (paid for) customer.)