Hacker News
by twelve40 1036 days ago
This. Unfortunately, there's Common Crawl, there's Bing, and a million other places they could get the data from (or hide behind). Or they could just ignore robots.txt; it's not like they run a very honest or transparent operation there.