by kordlessagain 639 days ago
There's an HTTP code for charging for access: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/402

Then there's a Lightning Network protocol for it: https://docs.lightning.engineering/the-lightning-network/l40...

With the Cloudflare stuff, it just seems like an excuse to sell Cloudflare services (and continue to force everyone to use it) as opposed to just figuring out a standard way of using what is already built to provide access for some type of micropayment.

1 comments

The problem is that soft technical measures like HTTP 402 and robots.txt aren't legally binding, so there's nothing stopping scrapers from just ignoring them. Cloudflare's value proposition here is that they will play the cat-and-mouse game of detecting things like spoofed user agents and residential proxies on your behalf, and actively block what appears to be scraper traffic unless it pays up.

Unfortunately this probably means even more CAPTCHAs for people using VPNs and other privacy measures as they ramp up the bot detection heuristics.

Sure, it's not legally binding, but if I see >100000 requests coming from 1 IP address within a week, I'm also not legally bound to make that 402 error go away. By having an automated payment mechanism, the two parties could come to an agreement they're both happy about.
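To make that concrete, here's a rough sketch of what "more than N requests from one IP in a week gets a 402 until it pays" could look like. Everything in it is made up for illustration: the `PaywallGate` class, the `mark_paid` hook, and the in-memory storage are assumptions, not how any real server or Cloudflare does it.

```python
import time
from collections import defaultdict, deque

WEEK_SECONDS = 7 * 24 * 60 * 60
LIMIT = 100_000  # threshold taken from the comment above


class PaywallGate:
    """Hypothetical per-IP gate: too many requests in the window -> 402."""

    def __init__(self, limit=LIMIT, window=WEEK_SECONDS):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.paid = set()               # IPs that settled up (assumed hook)

    def mark_paid(self, ip):
        # Stand-in for whatever payment mechanism the two parties agree on.
        self.paid.add(ip)

    def status_for(self, ip, now=None):
        now = time.time() if now is None else now
        q = self.hits[ip]
        # Forget requests that have fallen out of the sliding window.
        while q and q[0] <= now - self.window:
            q.popleft()
        q.append(now)
        if ip in self.paid:
            return 200
        return 402 if len(q) > self.limit else 200
```

Note that the counter is keyed on the IP address alone, so a client that switches to a new address starts again from zero.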

> there's nothing stopping scrapers from just ignoring them

Feel free to ignore HTTP errors, but those pages don't contain the content you're looking for

(For the record, I don't use HTTP 402, but I noncommercially host stuff and know what bots people are complaining about.)

I mean it's not legally binding in the sense that if you start sending 402s or 403s to a scraper, it can just take that as a signal to try again from a different IP address until it works. Your server's clearly stated intent that the bot should pay up or go away isn't legally actionable. With enough effort you can chase the bots until they run out of resources, but few people have time to win that battle by themselves, hence delegating it to Cloudflare or similar.

"Unfortunately this probably means even more CAPTCHAs for people using VPNs and other privacy measures as they ramp up the bot detection heuristics."

Yeah. You can't have it both ways. Similar dilemma for requiring identification vs disallowing immigrants.