by ErikAugust 2500 days ago
Just crawl, and then disregard sites that have certain scripts present.
1 comment

Yeah, technically, it'd be very easy. I've written countless spiders, crawlers, etc.

Basically, only index items where webpage.indexOf('google') < 0
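
To make that filter a bit more concrete, here is a minimal sketch of the same indexOf check. It is only an illustration: the marker list, the shouldIndex name, and the sample pages are all made up for the example. It matches on script hostnames rather than the bare string 'google', since a plain 'google' substring would also drop pages that merely mention Google in their text.

```javascript
// Hypothetical marker list (an assumption, not from the thread):
// substrings that identify the scripts we want to exclude.
const BLOCKED_MARKERS = ['google-analytics.com', 'googletagmanager.com'];

// Returns true when none of the markers occur in the page source.
// Comparison is case-insensitive because HTML attribute casing varies.
function shouldIndex(html, markers = BLOCKED_MARKERS) {
  const source = html.toLowerCase();
  return markers.every((marker) => source.indexOf(marker.toLowerCase()) < 0);
}

// Made-up crawl results standing in for fetched pages.
const pages = [
  {
    url: 'https://example.com/a',
    html: '<script src="https://www.google-analytics.com/analytics.js"></script>',
  },
  { url: 'https://example.com/b', html: '<p>No trackers here.</p>' },
];

// Keep only the pages that pass the filter.
const indexable = pages
  .filter((page) => shouldIndex(page.html))
  .map((page) => page.url);

console.log(indexable);
```

Note this only looks at the raw HTML the crawler fetched, so scripts injected at runtime by other scripts would slip through unless the crawler renders the page first.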