Yes! Training and generation are fair use. You are free to train and generate whatever you want in your basement for whatever purpose you see fit. Build a music collection, go ham.
If the output from said model uses the voice of another person, for example, we already have a legal framework in place for determining if it is infringing on their rights, independent of AI.
You may be reaching the limits of the metaphor here, but restaurants are absolutely responsible for the E. coli if it's found in significant quantities, whether it's in the initial ingredients or the cooked end product. A restaurant is required to vet its suppliers and ensure food safety protocols throughout the entire process with several independent checks at many points, and is ultimately directly responsible if a customer sues. A restaurant does not get to cook bad ingredients well and then point at the supplier. They will find themselves shut down immediately, and permanently if they do not resolve the situation.
In this context, this would be the equivalent of Suno explicitly placing stop points throughout the training, tokenization, and generation processes to verify that there was absolutely no chance of it generating copyrighted material through some kind of clean room reconstruction test. They would also need those tests to be audited at random by a third party governing body. Obviously they are not doing this, so the metaphor definitely does not track here.
They charge you by the amount of music you get from them. That's selling music. Selling a tool would be if they charge you once, you download the tool, and you can use it on your computer to generate as much music as you want to pay electricity for.
Sure, but if you are essentially making a copyright infringement tool, selling it to people so they can use it to infringe, and they then go and use it to infringe, you're a contributory infringer. I'm not saying this is exactly what Suno is doing, just pointing out that you can be an infringer without "selling songs to consumers".
When you use a DAW to recreate a favorite song for learning, should the DAW show a warning that you’re infringing on a copyrighted melody? Should it let you make it? Export it? You promise the DAW it’s for personal use? It’s only a matter of time until this stuff is in DAWs.
When a general computer-using agent recreates songs in Logic Pro in high fidelity, then what?
It’s called Fair Use for a reason – we let humans Use things generally and ask them to be Fair.
Or we can go in the direction of movies and TV where screenshots of protected content show up blank on my iPhone. Just in case someone wanted to, god forbid, clip the show.
I don't think anyone could reasonably characterize a DAW as a tool designed to infringe copyrights, so I don't think there is an issue. The fact that none of the labels have ever sued a DAW maker for this reason should give you some intuition on this matter.
>It’s called Fair Use for a reason – we let humans Use things generally and ask them to be Fair.
So exhausted with people who come to these threads and try to discuss legal issues by only paying lip service to the words and not their meanings, let alone the actual law that they seem to want to debate. Then they go even further and turn it into some grand political statement, or hypothesize why copyright shouldn't exist at all. But there is absolutely no jurisprudence that would indicate a DAW is the kind of tool I described. I understand you came up with an argument in your head why it could be, but I'm letting you know that in the law, it's not what would be considered a reasonable argument and it would go nowhere.
DAWs are tools made to create music, generally. They do not contain banks of copyrighted material on which the user ultimately pulls the copying "trigger" (that's the system I described).
What does "fair use" even mean in a world where models can memorise and remix every book and song ever written? Are we erasing ownership?
The problem is, copyright law wasn't written for machines. It was written for humans who create things.
In the case of songs (or books, paintings, etc.), only humans and companies can legally own copyright; a machine can't. If an AI-powered tool generates a song, there’s no author in the legal sense, unless the person using the tool claims authorship by saying they operated the tool.
So we're stuck in a grey zone: the input is human, the output is AI generated, and the law doesn't know what to do with that.
For me the real debate is: Do we need new rules for non-human creation?
why are you saying "memorize"? are people training AIs to regurgitate exact copies? if so, that's just copying. if they return something that is not a literal copy of the whole work, then there is established caselaw about how much is permitted. some clearly is, but not entire works.
when you buy a book, you are not acceding to a license to only ever read it with human eyes, forbearing to memorize it, never to quote it, never to be inspired by it.
> Specifically, the paper estimates that Llama 3.1 70B has memorized 42 percent of the first Harry Potter book well enough to reproduce 50-token excerpts at least half the time. (I’ll unpack how this was measured in the next section.)
> Interestingly, Llama 1 65B, a similar-sized model released in February 2023, had memorized only 4.4 percent of Harry Potter and the Sorcerer's Stone. This suggests that despite the potential legal liability, Meta did not do much to prevent memorization as it trained Llama 3. At least for this book, the problem got much worse between Llama 1 and Llama 3.
> Harry Potter and the Sorcerer's Stone was one of dozens of books tested by the researchers. They found that Llama 3.1 70B was far more likely to reproduce popular books—such as The Hobbit and George Orwell’s 1984—than obscure ones. And for most books, Llama 3.1 70B memorized more than any of the other models.
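For a rough sense of what a figure like "reproduces 50-token excerpts at least half the time" means, here is a minimal sketch. It assumes you already have per-token log-probabilities from a model for each excerpt; the function names, the toy numbers, and the 0.5 threshold wiring are illustrative assumptions, not the paper's actual code (the quoted article says it explains the real measurement in its next section, which is not reproduced here).

```python
import math


def sequence_probability(token_logprobs):
    """Probability of emitting the whole excerpt verbatim.

    This is the product of the per-token probabilities, computed as
    exp(sum of log-probabilities) to avoid underflow.
    """
    return math.exp(sum(token_logprobs))


def memorized_fraction(excerpt_logprobs, threshold=0.5):
    """Share of excerpts whose verbatim-reproduction probability
    is at least `threshold` (0.5 = "at least half the time")."""
    if not excerpt_logprobs:
        return 0.0
    hits = sum(
        1 for logprobs in excerpt_logprobs
        if sequence_probability(logprobs) >= threshold
    )
    return hits / len(excerpt_logprobs)


# Toy data (made up): three 50-token excerpts with a constant per-token probability.
near_certain = [math.log(0.99)] * 50  # whole-excerpt probability about 0.61
likely = [math.log(0.95)] * 50        # whole-excerpt probability about 0.08
coin_flip = [math.log(0.5)] * 50      # whole-excerpt probability is tiny

toy_fraction = memorized_fraction([near_certain, likely, coin_flip])

if __name__ == "__main__":
    print(f"memorized fraction of toy excerpts: {toy_fraction:.2f}")
```

Note how strict the bar is: even 95% confidence on every single token leaves under an 8% chance of reproducing all 50 tokens, so only the first toy excerpt counts as memorized here.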
You are comparing AI to humans, but they're not the same. Humans don't memorise millions of copyrighted works and spit out similar content. AI does that.
Memorising isn't wrong, but when machines memorise at scale and the people behind the original work get nothing, it raises big ethical questions.
The vast majority of pirated copies are not literal copies. Movies and music constantly get transformed into different sizes and scales, mostly with lossy transformations that change the work. A movie taken from a raw format and transcoded to 144p retains far less than 1% of the original data and is barely recognizable. Copyright law still seems to recognize that as infringement.
Most AI seems much better at reproducing semi-identical copies of an original work than existing video/audio encoders.
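As a back-of-the-envelope check on the "144p keeps far less than 1%" figure, here is a small sketch. The resolutions are the standard 16:9 frame sizes; the frame rate, the 12 bits per pixel (8-bit 4:2:0), and especially the 100 kbit/s bitrate for a 144p stream are assumptions picked for illustration, not measurements.

```python
def pixel_fraction(small, large):
    """Fraction of the larger frame's pixel count kept by the smaller frame."""
    return (small[0] * small[1]) / (large[0] * large[1])


def raw_bitrate(width, height, fps=24, bits_per_pixel=12):
    """Uncompressed video bitrate in bits per second.

    Defaults are assumptions: 24 fps and 8-bit 4:2:0 sampling (12 bits/pixel).
    """
    return width * height * fps * bits_per_pixel


P144 = (256, 144)
P1080 = (1920, 1080)
UHD_4K = (3840, 2160)

# Assumption: a typical low-quality 144p stream, around 100 kbit/s.
ASSUMED_144P_BITRATE = 100_000

frac_vs_1080p = pixel_fraction(P144, P1080)
frac_vs_4k = pixel_fraction(P144, UHD_4K)
frac_of_raw_bits = ASSUMED_144P_BITRATE / raw_bitrate(*P1080)

if __name__ == "__main__":
    print(f"pixels kept vs 1080p: {frac_vs_1080p:.2%}")
    print(f"pixels kept vs 4K:    {frac_vs_4k:.2%}")
    print(f"bits kept vs raw 1080p: {frac_of_raw_bits:.4%}")
```

By pixel count alone, 144p keeps under 2% of a 1080p frame and under 0.5% of a 4K frame; measured in bits against uncompressed 1080p, under these assumptions it is a small fraction of 1%.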
If, as a human artist, I decide to train myself on the discography of a famous artist, then produce songs in his style and sell them for cheap so that others don't have to pay for the original artist, then I am sure it is fair use. It is done all the time.
Now, what if instead of training myself using real instruments, I train my AI and do the same? Is it different?
It is complicated, but there are many arguments in favor of fair use, probably more than there are against. But as you say, let the courts decide.
In any case, piracy is illegal in every case. As a human, it is illegal for me to use pirated copies, whether it is for training myself as a musician, for training my AI, or for simply listening.
Well I've been able to get Suno to do Beatles covers. It only works maybe 1/20 times, but you can do it. It's not an exact replica either, but you can get the same chords and melodies as the original.
Well, there was that legal company that trained an LLM on its opposition's legal documents and then generated its own. I don't think inputs or outputs were ruled legal in that regard.
But as long as the model isn't outputting infringing works, there's not really any issue there either.
Not sure we can infer that (or anything) about Suno from this ruling. The judge here said that Anthropic's usage was extremely transformative. Would Suno's also be considered that way?
Anthropic doesn't take books and use them to train a model that is intended to generate new books. (Perhaps it could do that, to some extent, but that's not its [sole] purpose.)
But Suno would be taking music to train a model in order to generate new music. Is that transformative enough? We don't know what a judge thinks, at least not yet.
Only if the physical albums don't have copy protection; otherwise you're circumventing it, and that's illegal. Or is it, when weighed against the right to a private copy? If anything, AI at least shows that all of the existing copyright laws are utter bullshit made to make Disney happy.
Do keep in mind though: this is only for the wealthy. They're still going to send the Pinkertons to your house if you dare copy a Blu-ray.
If you read the ruling, training was considered fair use in part because Claude is not a book generation tool. Hence it was deemed transformative. Definitely not what Suno and Udio are doing.
If the output from said model uses the voice of another person, for example, we already have a legal framework in place for determining if it is infringing on their rights, independent of AI.
Courts have heard cases of individual artists copying melodies, because melodies themselves are copyrightable: https://www.hypebot.com/hypebot/2020/02/every-possible-melod...
Copyright law is a lot more nuanced than anyone seems to have the attention span for.