| HN Mirror


by harshreality 1118 days ago

AI training has nothing to do with copyright as it currently exists. Someone had access to a boatload of IP (because it was made publicly available) and trained a neural net with it. Now you want to retroactively create restrictions on what the implicit public rights were. Traditionally the implied license was something like: you can't republish, redistribute, or use commercially, even though the restriction on private redistribution hasn't been enforceable since the start of the internet era. Now you want more restrictions.

If someone generates an image that's sufficiently similar to a copyrighted work, and publishes it in a way that violates fair use, you can send a takedown and potentially sue them. How the image was created doesn't matter, any more than it would matter whether Warhol had been able to scan the photo and then manipulate it in Photoshop to get that result, instead of artistically copying it by hand. The result is the same. The potential for copyright infringement is the same, because it's the derivative work that matters, not the process.

What you're attempting to do instead is the equivalent of trying to regulate scanning because it operates on copyrighted works.

I suspect you understand why you want to regulate AI training rather than regulate its output. I think you know AI is going to flood the market: currently with certain types of images and simple music, but soon with photorealistic portraits, complex music, and eventually video and even more complex works. Essentially all of those works will be clearly novel, not close to existing human-created works. They won't be copyright violations, so you have to cut this tech off at the knees and feed the blood mouse [1] by retroactively deciding that AI training is a violation of the implied license granted when people make their creations publicly accessible. Those AI creations will destroy most of the market for human-created works, and you can't have that.

I don't think many people, other than rightsholders, want the IP dystopia your preferred policy would create: holders of large IP archives churning out endless AI-generated content (which they'll no doubt want to be able to copyright, contra the Copyright Office's current guidance), while shutting out most competitors, who won't have a sufficient library of the right flavor of IP to train an AI model.

[1] https://www.youtube.com/watch?v=5pIVVpoz5zk