Hacker News
by jasrys 1118 days ago
I testified before the US Copyright Office this morning in their roundtable session on AI and music[1]. A good portion of the panel focused on whether copyrighted inputs (in this case, sound recordings and musical compositions) being fed into AI models for training purposes could plausibly constitute a fair use under existing US copyright law.

Some of the comments here are missing the context of the recent (a week or so ago) Supreme Court decision in the Goldsmith/Warhol case[2], in which the Court ruled that transformativeness is not dispositive in and of itself in the context of a fair use defense to a copyright infringement claim. Of course, this has not been put to the test in the courts in the context of AI training yet, but it seems fairly clear that this ruling would likely extend to AI training on copyrighted works.

We (rightsholders in the music industry) hope to come to win-win licensing arrangements with the AI community and allow access to our songs for AI training purposes if the artist/writer so desires. There are some early talks in progress. Cautiously optimistic. Japan's approach seems short-sighted and desperate.

[1]: https://copyright.gov/ai/listening-sessions.html#sound-recor...
[2]: https://www.npr.org/2023/05/18/1176881182/supreme-court-side...

6 comments

>We (rightsholders in the music industry)

Considering the decades (maybe half a century soon?) of parasitic behavior by the music industry toward almost everything in tech, from the early internet to mp3 players to torrenting to streaming to lobbying for insane copyright laws, you guys calling Japan's approach "short-sighted" is like the single best praise anyone could give them.

Considering the absolutely awful organization JASRAC [1] (the Japanese music industry body, which a couple of years back stated that it would sue music teachers who taught its copyrighted material to students in private if they didn't pay a licensing fee), maybe Japan for once pushed through good legislation?

[1]: https://mainichi.jp/english/articles/20220930/p2a/00m/0et/01...

> We (rightsholders in the music industry) hope to come to win-win licensing arrangements with the AI community and allow access to our songs for AI training purposes if the artist/writer so desires.

It’s odd to frame win/lose as win/win.

I can see how it's win/win relative to "lobby to make producing or owning AI audio tools a crime", which is presumably one thing the industry is considering.

This is again win/lose.

How do you feel about human musicians learning from copyrighted works? Technical limitations aside, is that something you'd like to monetize?

> allow access to our songs for AI training purposes if the artist/writer so desires

This (a) means nothing, since the copyright holder can already do whatever they want, including licensing the works for any purpose; and (b) is even more restrictive than compulsory licensing, which requires the copyright holder to license the work (for a fee).

The solution you describe as a win-win would either create a quagmire of crisscrossing licensing deals (AI needs a lot of input; you can't train it on one artist), or in effect create an impenetrable moat for mega corporations such as Disney or Sony, who would be the only ones with enough heft to pull it off.

It's actually a lose-lose situation.

> transformativeness is not dispositive in and of itself in the context of a fair use defense

Could you dumb this sentence down for me?

I would guess it means that making a derived work, changing the original, makes no difference in whether reproducing the work (in altered form) is fair use.

But that sounds well-established. I can't imagine that movies would suddenly be legal to distribute if you just distributed the file backwards (people can then reverse it again to watch it), whether you claim that the distribution is fair use, that the file is not copyrighted to begin with, or whatever. Probably that's not what this court had to decide and I'm misunderstanding something?

Sure. In an infringement lawsuit involving a fair use defense, courts apply the four-factor test [1] to determine whether the use is indeed fair use under copyright law. The first of the four factors, the "purpose and character" of the use, is where "transformativeness" is assessed. The Goldsmith/Warhol ruling (to simplify) said that Warhol's changes to Goldsmith's photograph were not sufficiently "transformative," even though they contained new expression (adding orange color, etc.), because the end result effectively competed with the original photograph, and the use therefore did not qualify as fair use.

Right, your backwards movie example would fail the fair use test too. Nothing's really added, there's no new expression, it competes with the original, etc.

[1]: https://fairuse.stanford.edu/overview/fair-use/four-factors/

AI training has nothing to do with copyright as it currently exists. Someone had access to a boatload of IP (because it was made publicly available) and trained a neural net on it. Now you want to retroactively create restrictions on what the implicit public rights were. Traditionally, the implied license was something like: you can't republish, redistribute, or use the work commercially, even though restrictions on private redistribution have been impossible to enforce since the start of the internet era. Now you want more restrictions.

If someone generates an image that's sufficiently similar to a copyrighted work, and publishes it in a way that violates fair use, you can send a takedown and potentially sue them. How the image was created doesn't matter, any more than it would matter whether Warhol had been able to scan the photo and then manipulate it in Photoshop to get that result, instead of artistically copying it by hand. The result is the same. The potential for copyright infringement is the same, because it's the derived work that matters, not the process.

What you're attempting to do instead is the equivalent of trying to regulate scanning because it operates on copyrighted works.

I suspect you understand why you want to regulate AI training rather than regulate its output. I think you know AI is going to flood the market, currently certain types of images and simple music, but soon photorealistic portraits, complex music, and eventually video and even more complex works. Essentially all of those works will be clearly novel, not close to existing human-created works. They won't be copyright violations, so you have to cut this tech off at the knees and feed the blood mouse [1] by retroactively deciding that AI training is a violation of the implied license granted when people make their creations publicly accessible. Those AI creations will destroy most of the market for human-created works, and you can't have that.

I don't think many people, other than rightsholders, want the IP dystopia your preferred policy would create: holders of large archives of IP churning out endless AI-generated content (which they will no doubt want to be able to copyright, contra the Copyright Office's current guidance), while shutting out most competitors, who won't have a sufficient library of the right flavor of IP to train an AI model.

[1] https://www.youtube.com/watch?v=5pIVVpoz5zk