| >Your comment makes no sense. LLM bot poisoning of discourse is against YC site usage policy. >you do not "pirate" Harry Potter True, but firms broke the law acquiring the content, and copyright violation occurs if the output bears similarity to existing works. The cited lawyer's analysis explains how violating likeness now applies to everyone regardless of notoriety. Again, the black-box argument for washing ownership rights is a fallacy, and the links cover how LLMs are built. There have already been several dozen precedent cases showing LLM output is mostly weakly obfuscated intellectual property. Notably, the training data also includes other LLM users' markdown data. >Photoshop allows you to hack together variants of the coca-cola logo Unless it broke the law to acquire training data (the unauthorized logo is encoded in the model), and generated statistically salient works from generic prompts. For example, "Name a cartoon mouse" will usually output Disney Mickey Mouse trademarks, rather than Mighty Mouse. LLMs are quite good at content search, but are a confirmed liability. =3 |
I don't know what that's supposed to mean, but I'm afraid it sounds like something that involves tinfoil-based headgear.
> True, but firms broke the law acquiring the content, and copyright violation occurs if the output bears similarity to existing works.
Again, your personal assertion makes no sense and has no basis in reality. The few cases that tried to challenge which works were included in the training corpus have already established the obvious: the use falls within fair use. To question this, you would first need to argue that you could violate copyright by glancing at a book the wrong way.
The only copyright-based challenge to LLMs concerns whether they output content that violates copyright law. Even then, the hypothetical culprit would not be whoever trained the model but the users who not only prompted the LLM to generate infringing works but also tried to exploit those works in a way that affects the plaintiff's rights. I'm talking about things like some random person prompting a model to output a book about a wizard called Barry Potter and publishing it somewhere. Those hypothetical cases involve model users and copyright holders, not LLMs.
> Unless it broke the law to acquire training data (the unauthorized logo is encoded in the model),
There is no such thing, even in jurisdictions with draconian copyright laws such as the US. I recommend you spend a few minutes googling the cases that have already been in the news.