Hacker News
by CableNinja 1040 days ago
Respecting robots.txt and setting a UA are two different things. And yes, I know the UA can be set to anything; however, the UA has been mentioned, and under normal circumstances it shouldn't change drastically for a lot of these scrapers.

Respecting robots.txt has nothing to do with what the UA is set to. Yes, you can say in robots.txt that this UA can do X, but if the scraper doesn't respect the file, that rule is moot.
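
To illustrate the point, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` and `OtherBot` names, the example.com URLs, and the `is_allowed` helper are all made up for the example. All it shows is that the file records a per-UA request which a polite client checks for itself; nothing on the server side enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one rule set for a made-up crawler,
# one for every other user agent.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return what robots.txt asks of this user agent.

    This is a check the client chooses to run. A scraper that never
    calls it (or ignores the answer) can still fetch the URL.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for url in ("https://example.com/page", "https://example.com/private/x"):
            print(agent, url, is_allowed(agent, url))
```

A well-behaved crawler would call something like `is_allowed` before each request. One that skips the check gets the page anyway, which is why the rule only matters if the scraper chooses to honor it.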

The method I put in place does not use robots.txt, so there's no need to worry about them not respecting it anymore.

As someone else mentioned, like the world of spam and such, it's an arms race. The solution may not be perfect, but it's functional.