Hacker News
by andy99 21 days ago
Captchas exist primarily to punish users for not allowing tracking or for not using the “right” services. They may prevent some bots as a side effect (or as a pretence from the provider), but it’s mostly a way for Google and Cloudflare to abuse their monopolies.
3 comments

Google I would say yes, but what does Cloudflare gain? They don't run an ad network. Generally I'd say Cloudflare is pretty good to have as a guardian of the web compared to other options.

They protect free speech and allow Tor users. Ever tried completing a reCaptcha on Tor?

Cloudflare gains things like this:

https://blog.cloudflare.com/introducing-pay-per-crawl/

https://developers.cloudflare.com/browser-run/quick-actions/...

They create a new problem and sell the solution.

Nowadays, somebody can just ask Claude to build them a scraper/bot that hooks into a proxy network, and all of a sudden they can easily send 20k+ reqs/min from hundreds or thousands of IPs, cycling through them as they get rate limited or banned. In my work, the scrapers have gotten way more aggressive in the last 2 years or so. Frankly, I'm happy there is a solution.
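
To make the arithmetic concrete, here is a rough sketch of why a plain per-IP rate limit doesn't help much against that kind of traffic. Everything in it is made up for illustration: the `PerIPRateLimiter` class, the 60 reqs/min limit, and the fake IP addresses are assumptions, not anyone's actual setup.

```python
from collections import defaultdict, deque


class PerIPRateLimiter:
    """Hypothetical sliding-window limiter: `limit` requests per `window` seconds, per IP."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def allow(self, ip, now):
        recent = self.hits[ip]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


def simulate(num_ips, requests_per_ip, limit):
    """Count how many requests get through when the load is spread over `num_ips` addresses."""
    limiter = PerIPRateLimiter(limit)
    allowed = 0
    for i in range(num_ips):
        ip = f"10.0.{i // 256}.{i % 256}"  # made-up addresses
        for _ in range(requests_per_ip):
            # All requests land at the same instant, i.e. inside one window.
            if limiter.allow(ip, now=0.0):
                allowed += 1
    return allowed


# Same 20,000 requests in one minute, sent two different ways.
single_ip = simulate(num_ips=1, requests_per_ip=20_000, limit=60)
rotating = simulate(num_ips=1_000, requests_per_ip=20, limit=60)
print(single_ip, rotating)
```

From one address, the limiter passes only the first 60 requests. Spread across 1,000 addresses at 20 requests each, every request stays under the per-IP limit and all 20,000 get through.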

There may be things to criticize Cloudflare for, but the problem of bots and scrapers destroying the open web was getting worse no matter what.

God damn it.

Tin hat folk say Cloudflare is CIA. I dunno.

I can relate to the cynicism, but it's also a general tool in the effort to combat bot abuse on public-facing post forms that are trying to do something for real people. Many everyday devs reach for tools like this because of the deluge of garbage they get in its absence.

My take is that it's a very hard problem, so hard that even captchas by the biggest internet company can't get it right. I strongly hesitate to roll my own bot-friction strategy when other tools are available. But I recognize I may have a lack of imagination here, and I would absolutely love to hear alternate ideas, especially for small projects that may not need the heft of corporate captchas.

We use captchas to cut down on bots and crawlers. They don't work as well as they used to but they at least alter the economics somewhat, or so I tell myself.

Our reason for this is to try to make HN as good as possible for its real users.

I’ve never encountered a captcha on HN. Do you guys use less aggressive settings?

The reason captchas bother me so much is that they always seem to happen in the course of legitimate activities. I had one when trying to make a charity donation, and another when ordering something. I have no idea why it would be hard to identify such traffic as legitimate. I’m convinced it’s because I’m using a nonstandard browser, not allowing cookies, etc.

If I were attempting an automation, a bulk download, or whatever, I’d take the captcha as an interesting professional challenge. When I’m trying to use someone’s services or pay them money, it’s just ridiculous friction, and I generally abandon any transaction that makes me do a captcha.

Incidentally, I have scraped HN and never encountered any problems, since you have an API for it.

Yeah, the only problem I've ever had accessing HN was banned IP addresses. Never seen a captcha.

It mostly kicks in on new accounts.