Hacker News
by mgrandl 384 days ago
What does proof-of-work mean here and what makes it easy for humans and hard for bots?
4 comments

Think of crawlers: a crawler typically makes hundreds or thousands of requests per second. The owners of the crawler then sell this data for $X, or otherwise gain $X in profit.

Proof of work adds a very small cost to each individual request, increasing the cost of crawling to a number higher than X. Because actual humans make very few requests, we don’t notice the increase in cost.
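A rough back-of-the-envelope version of that argument, as a sketch. Every number here is a made-up assumption (1 CPU-second per challenge, $0.04 per CPU-hour, 50 page loads a day for a human, 1000 requests per second for the crawler), and `pow_cost_usd` is a hypothetical helper, not part of any real tool:

```python
def pow_cost_usd(requests: int,
                 cpu_seconds_per_request: float,
                 usd_per_cpu_hour: float) -> float:
    """Total compute cost of solving one challenge per request."""
    cpu_hours = requests * cpu_seconds_per_request / 3600
    return cpu_hours * usd_per_cpu_hour


# Assumed figures, for illustration only.
CPU_SECONDS_PER_CHALLENGE = 1.0
USD_PER_CPU_HOUR = 0.04

human_daily = pow_cost_usd(50, CPU_SECONDS_PER_CHALLENGE, USD_PER_CPU_HOUR)
crawler_daily = pow_cost_usd(1000 * 86_400, CPU_SECONDS_PER_CHALLENGE, USD_PER_CPU_HOUR)

if __name__ == "__main__":
    print(f"human:   ${human_daily:.6f} per day")
    print(f"crawler: ${crawler_daily:,.2f} per day")
```

Under these assumptions the human pays a small fraction of a cent per day while the crawler pays roughly $960 per day, so the scheme works whenever that figure exceeds the X the crawler earns.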

This exactly. Having run very large scraping operations, I can say it only takes a slight increase in cost to make scraping unprofitable for many use cases.
Right, scale is solved… but targeted “attacks” are not solved at all.

If some site uses this and that site is the only one I want, whether as an attacker, a personal scraper, etc., this is wholly ineffective at telling human from bot.

When you use a captcha, you presumably want to defeat someone curling your CreatePost endpoint, not just make it more annoying to do so at botnet scale.

This captcha still lets all traffic through. Except now you waste the battery of honest users.

Even HN proponents of the idea don't use it on their own sites.

I'd rather see something like Anubis than some unsolvable captcha. I never understood the battery argument. I reckon my screen uses more energy during PoW solving than my phone spends solving these PoWs.
> I'd rather see something like Anubis than some unsolvable captcha.

So would bad actors. Which is why everyone uses normal captchas and not mere PoW.

PoW is the easiest captcha to beat.

[citation needed]
For which part?

Every time a new submission is created on HN, you have a curl script that posts a comment on it shilling your product. (According to the /newest tab there seems to be one submission every few minutes.)

What's harder for you to automate: the comment always posts successfully after 500ms, or you get a Cloudflare Turnstile captcha every time?

It's equally easy for both. But people using browsers only do it a few times, while bots need to do it many times. A second for a human every X pages is not much, but it's a death knell for the general practice of bots (and they can't store the cookies because you can rate-limit them that way).

Imagine scraping thousands of pages, but with an X > 1 second wait for each. There wouldn't be a need for such a solution if crawlers rate-limited themselves, but they don't.

So is the solution to stymieing bots to just add a page load delay of a second or two? Enough that people won't care, but it doesn't scale for bots?
Just adding a delay wouldn't achieve anything because bots can just do something else while they wait, whereas PoW requires them to actively spend their finite resources before they can continue doing whatever they want to do.
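To make the "actively spend their finite resources" part concrete, here is a minimal hashcash-style sketch. It is not how Anubis or any particular product implements it; the SHA-256 scheme, the `solve`/`verify` names, and the 8-byte nonce encoding are all assumptions chosen for illustration:

```python
import hashlib


def _meets_difficulty(digest: bytes, bits: int) -> bool:
    """True if the digest starts with at least `bits` zero bits."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def _digest(challenge: bytes, nonce: int) -> bytes:
    return hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()


def solve(challenge: bytes, bits: int) -> int:
    """Client side: brute-force the smallest nonce that meets the difficulty.

    Expected work is about 2**bits hashes, and the CPU is busy the whole
    time, unlike a sleep, during which it could serve other requests.
    """
    nonce = 0
    while not _meets_difficulty(_digest(challenge, nonce), bits):
        nonce += 1
    return nonce


def verify(challenge: bytes, nonce: int, bits: int) -> bool:
    """Server side: a single hash, regardless of difficulty."""
    return _meets_difficulty(_digest(challenge, nonce), bits)


if __name__ == "__main__":
    challenge = b"per-request-server-chosen-value"  # hypothetical challenge
    nonce = solve(challenge, 16)
    print(nonce, verify(challenge, nonce, 16))
```

The asymmetry is the point: the solver pays for many hashes, the verifier pays for one. A plain delay has no such cost, because a waiting bot can run other requests in parallel.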
So if you're rate limited to one request per second, you just use 100 cookies to make 100 requests per second: 1 request per second per cookie.
I think it's only more expensive for bots, though; it's just as easy for them.
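A toy simulation of that bypass, assuming the server keys its limit on the cookie alone. `PerKeyRateLimiter` and `accepted_in_one_second` are hypothetical names for this sketch, not any real library's API:

```python
class PerKeyRateLimiter:
    """Allows at most one request per `min_interval_s` for each key."""

    def __init__(self, min_interval_s: float = 1.0) -> None:
        self.min_interval_s = min_interval_s
        self._last_accepted: dict[str, float] = {}

    def allow(self, key: str, now: float) -> bool:
        last = self._last_accepted.get(key)
        if last is not None and now - last < self.min_interval_s:
            return False
        self._last_accepted[key] = now
        return True


def accepted_in_one_second(num_cookies: int, attempts_per_cookie: int) -> int:
    """Count requests accepted when each cookie hammers the server for 1s."""
    limiter = PerKeyRateLimiter(min_interval_s=1.0)
    accepted = 0
    for i in range(attempts_per_cookie):
        now = i / attempts_per_cookie  # simulated timestamps in [0, 1)
        for c in range(num_cookies):
            if limiter.allow(f"cookie-{c}", now):
                accepted += 1
    return accepted


if __name__ == "__main__":
    print(accepted_in_one_second(1, 10))    # one cookie
    print(accepted_in_one_second(100, 10))  # a hundred cookies
```

With one cookie a single request gets through in that second; with 100 cookies, 100 do. The limit only bites if each new cookie itself costs something to obtain, which is the role the PoW plays here.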

The problem with bots is they quite often farm this out to stolen resources. It makes sending whatever they are sending slower, but doesn't stop it.

It will make server hijacking more noticeable and harder to hide.
ahh, that makes sense, thanks

I do think that calling this a CAPTCHA when it's not actually intended to distinguish humans from computers is a bit misleading, but I can see why you would do that