by WheatMillington 1052 days ago
You're being a little dramatic. It's incredibly unlikely that millions of innocent users have been blocked, and unless you have data to the contrary you shouldn't make such a claim.

You know what else is harmful to the concept of the open internet? The enormous malicious botnets and other endemic problems that require a solution like CloudFlare.


Data point of N+1, but I haven't been able to place online orders at Petco for about a year now because they use some Cloudflare feature that hates my browser + home internet connection. Other Cloudflare-proxied sites seem unaffected, and I'm not doing any botting/crawling, nor do I have any IoT devices on my home network. There's not enough information provided to be able to do any substantive troubleshooting.

This became irritating enough that it caused two side effects: (a) I stopped shopping at Petco, and (b) I moved a pile of sites off of Cloudflare and stopped recommending them, and now sometimes recommend against them.

Cloudflare is still a good, quick, cheap option for sites that receive unusual volumes of malicious traffic, so I'll still recommend them as a solution to some problems. But, they're not a good default.

So you're mad at Cloudflare because Petco enabled a feature that blocked you? If Petco had developed something in-house that blocked you, would you be mad at the compiler?
Cloudflare offers this service. If Cloudflare offered a service that enabled Petco to do something amazing, would you be grateful to Cloudflare? If Cloudflare advertised on its homepage about blocking a DDoS attack on a website, would you say, "meh, Cloudflare wasn't responsible for blocking that attack, they only provided a feature. The website blocked the attack."? If not, then why should Cloudflare be immune from criticism when the opposite happens?

Cloudflare offered Petco the features to do this as a product and makes money off of Petco's usage of those features. I do sympathize with the perspective that ultimately tools need to be somewhat neutral and it can be dangerous to forward around responsibility. But "tools are neutral" can also be taken to an absurd degree. This isn't 5 levels of indirection here, and it's not Petco going and installing a neutral piece of software that they downloaded from GitHub. Petco is a client. They're turning on toggles that Cloudflare built into its user interface and advertises as features.

There's some level of moral accountability there for how those features are abused. I'm not saying it should be illegal, I'm not saying it shouldn't be allowed, but Cloudflare is definitely at least eligible for criticism. This is a product, it's not Petco abusing Cloudflare's infrastructure; they're using the product as intended and advertised.

...no, I've changed my recommendations for Cloudflare because it may prevent ordinary users from using a site, and insufficient information is provided for troubleshooting purposes, and those users are likely not going to go to extraordinary lengths to report the problem. Even if they do report it, the site won't be able to troubleshoot it either. So, if you don't need it, you're probably better off without it.
> It's incredibly unlikely that millions of innocent users have been blocked

Is there a 'town square' where we can talk about being presented with captchas and similar things from third-party intermediaries?

I think it's incredibly likely that millions of hours have been wasted on such challenges.

On that note...

https://www.folklore.org/StoryView.py?project=Macintosh&stor...

"Well, let's say you can shave 10 seconds off of the boot time. Multiply that by five million users and thats 50 million seconds, every single day. Over a year, that's probably dozens of lifetimes. So if you make it boot ten seconds faster, you've saved a dozen lives. That's really worth it, don't you think?"

Imagine if people still thought like this about computers and software.
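
For anyone who wants to check the arithmetic in that quote, here is a rough sketch. It assumes one boot per user per day (the quote's "every single day") and a 75-year lifetime, which is my own assumption and not from the story; the function name is made up for illustration.

```python
# Back-of-the-envelope check of the boot-time quote.
# Assumptions (not from the story): one boot per user per day,
# and a hypothetical 75-year lifetime.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def lifetimes_saved_per_year(seconds_saved, users, lifetime_years=75):
    """Return (seconds saved per day, lifetimes saved per year)."""
    daily_seconds = seconds_saved * users
    yearly_seconds = daily_seconds * DAYS_PER_YEAR
    # Convert the yearly total into person-years, then into lifetimes.
    person_years = yearly_seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)
    return daily_seconds, person_years / lifetime_years


daily, lifetimes = lifetimes_saved_per_year(seconds_saved=10, users=5_000_000)
print(f"{daily:,} seconds per day, about {lifetimes:.1f} lifetimes per year")
```

Under those assumptions the daily figure matches the quote's 50 million seconds, and the yearly total comes out nearer to one dozen lifetimes than to "dozens".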

Yes. And cookie splash screens! I admire GDPR's intention, but hasn't it been a massive human time sink?

Not to take away from your point, just that it's all a hindrance.

That's more on the websites that track your personal data for non-essential purposes. No tracking means no banners are necessary.
Finer points, my point is just about people wishing to view web pages.
I don't know that most web admins can tell if they should float a banner, so vague is the law.

Technically, I think if you have the default Apache logging configured and you read those logs, you should probably float that banner.

I believe you're mistaken. GDPR allows you to record IP addresses for normal operation of a site, which specifically includes logging. No banner is required.

GDPR is not "vague" about this; perhaps you haven't read it (as laws go, it's pretty easy to read).

@adammartinetti : maybe you could consider developing a new product where you display a GDPR consent banner once, and then these settings apply to all Cloudflare-proxied websites (by passing this consent information as an additional header to the proxied site)
Sounds inferior to the "no cookies no banner" solution.

The GDPR does not mandate gratuitous and pointless personalised spying, which is the only case that requires consent. Normal operations (say a shop collecting payment details and shipping address to fulfil an order) do not require a consent banner.

Those can at least be blocked with ad blockers and/or disabling JS.

ReCAPTCHA was designed with this in mind: given that we had the need to distinguish humans from bots, it presents problems that are hard for bots to solve, where the resulting output is valuable. So the time consumed isn't wasted.

It's wasted from the perspective of the end user.

Not when the end user turns around and uses Google Maps which is now populated with higher quality fine feature information due to the training of the machine learning system on what traffic controls look like.

Valuable to whom?
I don't get your second point. Two things can be harmful to the open web at once. CloudFlare is definitely not taking the right approach to it, which damages the open web alongside botnets. Also, botnet owners are for some reason extraordinarily nerdy and smart, so they will probably find a way to fool CF every other month. It's a cat-and-mouse game for them, while everyone else is actively harmed both by the botnets and by the increased aggressiveness of CF caused by its incorrect solution.
>You know what else is harmful to the concept of the open internet? The enormous malicious botnets and other endemic problems that require a solution like CloudFlare.

You know what's infinitely worse? Monopolies.

Half these problems can be fixed by banning certain parts of the world. It's just politically shifted out of the Overton window to do that so CF profits greatly.

Every one user who finds their way here and posts on this thread probably represents 1,000,000-plus normal users.

An open web is open for everyone/thing not just classes of beings you select. Bots and users can both be malicious and both can be positive.

I agree with the premise that most people don't know how to identify or visibly complain about a given technical problem, and so an HN thread with N anecdotes about the problem likely corresponds to N * F actual real-world incidents, for some value of F > 1... but claiming it's a factor of a million without any backing evidence is absolutely an overreach.

> An open web is open for everyone/thing not just classes of beings you select. Bots and users can both be malicious and both can be positive.

This I agree with. I run an archiver ~monthly on a subset of my month's browsing history, and I'd hate if that got me blacklisted from Cloudflare-backed sites for a benign purpose. (See also the idea of remote attestation)

That's a pretty good idea. Do you randomly sample, or just exclude some domains? Is there some tool out there that does it for you?
Assembling the list of links to archive is a manual process--I just log them in an Obsidian notebook with a category and summary, and I later post it to my blog. (I don't really think other people care, it's more for me to be able to find past things I've found interesting.)

For the archival process I use ArchiveBox[1] running as a container on my NAS; I just grep through the note for `http|https` and feed the resulting list to the archiver. For everything not-hackernews I set the depth to 1, but for HN threads I do 2 so I grab whatever people may have linked in the comments.

I think there's ways to hook into like, ALL Firefox history or saved posts on reddit, but that's way heavier than what I care for.

[1]: https://archivebox.io/
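
The "grep the note for links, then pick a depth" step described above could be sketched roughly like this. It is an illustration only, not the commenter's actual script: the function names and the regex are assumptions, and handing the resulting list to ArchiveBox is deliberately left out.

```python
import re
from urllib.parse import urlparse

# Loose URL matcher, roughly equivalent to grepping for http|https.
# Stops at whitespace and at common closing delimiters.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(note_text: str) -> list[str]:
    """Return the unique URLs in a note, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(note_text):
        # Drop sentence punctuation that tends to cling to a pasted link.
        url = match.group(0).rstrip(".,;:!?")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def depth_for(url: str) -> int:
    """Depth 2 for HN threads, 1 for everything else, as in the comment."""
    host = urlparse(url).hostname or ""
    return 2 if host == "news.ycombinator.com" else 1


note = "Read https://news.ycombinator.com/item?id=1 and (https://archivebox.io/)."
for url in extract_urls(note):
    print(depth_for(url), url)
```

Deduplicating while preserving order keeps the archive list stable even if the same link is logged twice in one month.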

Interesting! Firefox history is just SQLite. I might do something like, take all non-search URLs and archive them once a month or so. Thanks for the inspiration.
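
A minimal sketch of that idea, assuming the history lives in the `moz_places` table of a profile's `places.sqlite`, with `last_visit_date` stored as microseconds since the Unix epoch. The list of search hosts and the function name are hypothetical, and it is probably safest to run this against a copy of the file, since Firefox may hold a lock on the live database.

```python
import sqlite3
from urllib.parse import urlparse

# Hypothetical list of hosts to treat as "search"; adjust to taste.
SEARCH_HOSTS = {"www.google.com", "duckduckgo.com", "www.bing.com"}


def non_search_urls(conn: sqlite3.Connection, since_us: int) -> list[str]:
    """Return http(s) URLs visited at or after since_us, skipping search hosts.

    since_us is a timestamp in microseconds since the Unix epoch.
    """
    rows = conn.execute(
        "SELECT url FROM moz_places "
        "WHERE last_visit_date >= ? AND url LIKE 'http%' "
        "ORDER BY last_visit_date",
        (since_us,),
    )
    return [
        url
        for (url,) in rows
        if (urlparse(url).hostname or "") not in SEARCH_HOSTS
    ]


if __name__ == "__main__":
    import time

    # Path is an assumption: point it at a copy of your own places.sqlite.
    conn = sqlite3.connect("places-copy.sqlite")
    month_ago_us = int((time.time() - 30 * 24 * 3600) * 1_000_000)
    try:
        for url in non_search_urls(conn, month_ago_us):
            print(url)
    except sqlite3.OperationalError as exc:
        print(f"could not read history: {exc}")
```

Rows that were never visited have a NULL `last_visit_date`, so the comparison in the query drops them without extra handling.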
Cloudflare blocks me every time I open an incognito window. No VPN, just having no cookie towards a domain automatically means I'm a bot…