Hacker News
by yjftsjthsd-h 102 days ago
> On the broader point, I hear you, but I think there's a middle ground. Not all content is public knowledge. Some of it is premium, proprietary, or behind a paywall. The people publishing it should get to decide whether it becomes free training data.

I don't follow. Are you suggesting that someone is scraping private sites that require a login in order to train AI on their content?