by jwakely 126 days ago
It's a valid concern, and one that was raised on Reddit a few times too.

But if you're building an open and fair model, I hope you're not just sucking up the entire web, training it on endless stolen data, and constantly DoS'ing open source projects. If you just send out crawlers to consume everything, expect some poison. So maybe don't build models that way.