| HN Mirror



	by davidascher 4661 days ago
	Crawling is certainly a complementary data collection strategy, but it's harder to avoid IP-based "filter bubble" effects without also deploying something akin to a bot. The hope is that by relying on real people using real browsers, we'll collect data that reflects actual behavior in the wild. You're right that poisoning is a potential problem if and when the data ends up useful enough to be worth poisoning.