Hacker News
by davidascher 4614 days ago
Crawling is certainly a complementary data collection strategy, but with crawling it's harder to avoid IP-based "filter bubble" effects without also deploying something akin to a bot. The hope is that by relying on real people using real browsers, we'll collect data that reflects actual behavior in the wild.

You're right that poisoning is a potential problem if/when the data ends up useful enough to warrant poisoning.