They say they removed pages, not websites. False positives aren't a problem when you're still left with 750 GB of data; at that point, quality matters more than slightly higher quantity.
Sorry, I was thinking of pages even though I wrote websites. Native language interference (in my language, we typically use the same term for pages and websites).
Anyway, my point is not a matter of quantity. The way they're doing it, they have 750 GB of data, but they have exactly zero data that talks about bastards, fecal transplants, etc. So they may have a hard time answering questions about those specific subjects.
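To make the concern concrete, here is a rough sketch of what page-level filtering with a word blocklist might look like. The blocklist contents, the `keep_page` helper, and the sample pages are all made up for illustration; this is not their actual pipeline, just the general technique as I understand it.

```python
import re

# Hypothetical blocklist; the real list is not given in this thread.
BLOCKLIST = {"bastard", "fecal"}

def keep_page(text, blocklist=BLOCKLIST):
    """Return True only if no blocklisted word appears anywhere in the page."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return words.isdisjoint(blocklist)

# Made-up sample pages: one medical, one unrelated.
pages = [
    "Fecal transplants can treat recurrent C. difficile infections.",
    "The weather in Paris is mild in spring.",
]

# A single match drops the whole page, however legitimate the rest of it is.
kept = [p for p in pages if keep_page(p)]
```

With a filter like this, the medical page is dropped entirely, so nothing about that topic survives into the final dataset no matter how much data is left overall.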