by brianjking 1030 days ago
I mean, they did write their own crawler and have huge financial incentives to respect it.

What isn't known is whether those sites' content will still end up in training corpora built from Common Crawl, The Pile, etc.