HN Mirror


by dahart 836 days ago

I don’t know about this one, but in general it seems like some of the legal issues coming at AI deployments are quite legitimate. Some AI datasets are definitely violating existing copyright law. We can certainly debate whether the laws should change, and there are valid points on both sides. But to the question of whether some are breaking current laws, the answer is clearly yes.

Obviously the answer here is for companies producing AI to curate, obtain, and/or pay for fully legal training data. The problems have been that gathering and using copyrighted data is very easy, that AI is extremely data hungry (some experts theorize massive data alone is responsible for AI success, and algorithms are secondary at best), and that there’s at least the perception, if not the reality, of a high-stakes winner-takes-all race to produce the best AI.

To me this feels a little like the situation tech companies have put themselves in with automated support and no way to reach humans. They couldn’t have scaled like they did without dropping hands-on support, but they’ve created a time bomb that is beginning to go off in more and more serious ways.