
by shagie 1090 days ago

My take on copyrights and AI models...

Taking copyrighted material and using it to train a model is not copyright infringement - it is sufficiently transformative and serves a different purpose than the original images.

Note that AI models can be used for different things. There has never been an uproar over a model trained to identify objects in an image emitting "squirrel" in its output text.

The model itself, as a purely mathematical transformation of the original source material, also does not get a copyright. If it needs to be protected, trade secrets are the tool to use. A model is no more copyright-worthy than the result of taking an image and applying `gray = .299 red + .587 green + .114 blue` to it.
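To make the grayscale comparison concrete, here is a minimal sketch of that weighted sum applied to every pixel. The function names and the tiny sample image are made up for illustration; only the three weights come from the formula above (they are the common luma coefficients used for grayscale conversion).

```python
def to_gray(pixel):
    """Map one (red, green, blue) pixel to a single gray level."""
    red, green, blue = pixel
    # Same weights as the formula in the comment above.
    return 0.299 * red + 0.587 * green + 0.114 * blue


def image_to_gray(image):
    """Apply the weighted sum to every pixel of a row-major image."""
    return [[to_gray(pixel) for pixel in row] for row in image]


# Hypothetical 1x3 image: pure red, pure green, pure blue.
image = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
gray = image_to_gray(image)
```

The point of the comparison is that the output is fully determined by the input and a fixed arithmetic rule; no creative choice is made along the way.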

The output of a model is ineligible for copyright protection (in the US - and most other places).

The output of a model may qualify as a derivative work of the original content used to train the model.

It is up to the human, who has agency in asking the model to generate certain output, to verify that the output does not infringe upon other works if it is published.

Note that the responsibility of the human publishing the work is nothing new with an AI model. It is the same responsibility they would have if they copied something from Stack Overflow or commissioned a random person on Fiverr... it's just that we've overlooked those cases for a long time. It is similarly quite possible for the material from those sources to be copyrighted by and/or licensed to some other entity, and the human copying it into the final product is responsible for any copyright infringement.

Saying "I copied this from Stack Overflow" or "I found this on the web" works no better as a defense than "Copilot generated this for me" or "Stable Diffusion generated this when I asked for a mouse wearing red pants", and each represents a similar dereliction on the part of the person publishing the content.