Hacker News new | ask | show | jobs
by manigandham 3904 days ago
> If you don't want someone to access your page, then don't respond to their request

> there's no easy way to always reliably identify the requester

That's the problem: you can't identify the person to block them in the first place.

Robots.txt is actually an explicit signal of intention for reputable search engines but that's all we have today and is easily ignored and does not work with these scrapers or anyone else.

Not sure what your last sentence means.