Hacker News
by LaundroMat 2014 days ago
Trust me, that's not the first thing you think about when designing your scraper.

Typically, one doesn't care whether the same page has been visited before. What one does care about is avoiding storing duplicate data.
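
To illustrate the distinction, here is a minimal sketch of deduplicating on stored content rather than on visited URLs. It assumes records are plain strings and that identical content (ignoring surrounding whitespace) counts as a duplicate; `DedupStore` and its `add` method are hypothetical names made up for this example, not part of any scraping library.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that skips content it has already saved."""

    def __init__(self):
        self._seen = set()   # SHA-256 digests of content stored so far
        self.records = []    # the records that were actually kept

    def add(self, url, content):
        """Store the record unless identical content exists; return True if stored."""
        # Assumption: whitespace at the edges is not a meaningful difference.
        digest = hashlib.sha256(content.strip().encode("utf-8")).hexdigest()
        if digest in self._seen:
            return False
        self._seen.add(digest)
        self.records.append({"url": url, "content": content})
        return True


store = DedupStore()
store.add("https://example.com/a", "same article text")
store.add("https://example.com/b", "same article text")  # different URL, skipped
store.add("https://example.com/a", "updated article text")  # same URL, stored
```

Note that the same URL can be fetched twice without harm, while the same content reached through two different URLs is stored only once. A real scraper would probably keep the digests in a database rather than in memory.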

1 comment

> a basic feature would be deduplication