by locknitpicker 51 days ago
> Unless the "AI" content output is fundamentally unable to prevent piracy of other peoples content (...)

Your comment makes no sense. The whole concept of "piracy" is meaningless when applied to LLMs, unless you go way out of your way to prompt models to output specific works verbatim.

Also, you do not "pirate" Harry Potter if you prompt a model to generate a story that directly or indirectly involves Harry Potter in any way. As always, you can argue trademark or copyright violations if someone tries to use said work for commercial purposes, but that question is orthogonal to LLMs.

Just because Photoshop allows you to hack together variants of the coca-cola logo does not mean Adobe is liable for trademark or copyright violations.


>Your comment makes no sense.

LLM bot poisoning discourse is against YC site usage policy.

>you do not "pirate" Harry Potter

True, but firms broke the law acquiring the content, and copyright violation occurs if the output bears similarity to existing works. The cited lawyer's analysis explains how likeness violations now apply to everyone, regardless of notoriety.

Again, the black-box argument for washing ownership rights is a fallacy, and the links cover how LLMs are built. There have already been several dozen precedent cases showing LLM output is mostly weakly obfuscated intellectual property.

Notably, the training data also includes other LLM users' markdown data.

>Photoshop allows you to hack together variants of the coca-cola logo

Unless it broke the law to acquire training data (the unauthorized logo is encoded in the model), and generated statistically salient works from generic prompts. For example, "Name a cartoon mouse" will usually output Disney's Mickey Mouse trademarks rather than Mighty Mouse.

LLMs are quite good at content search, but are a confirmed liability. =3

> LLM bot poisoning discourse is against YC site usage policy.

I don't know what that's supposed to mean, but I'm afraid it sounds like something that involves tinfoil-based headgear.

> True, but firms broke the law acquiring the content, and copyright violation occurs if the output bears similarity to existing works.

Again, your personal assertion makes no sense and has no basis in reality. The few cases that tried to attack which works were included in the training corpus have already established the obvious: the use falls within fair use. To question this fact you would first need to assert that you could violate copyright by glancing at a book the wrong way.

The only challenge to LLMs based on copyright law involves whether they are outputting content that violates copyright law. Even then, the hypothetical culprit would not be whoever trained the model but the users who not only prompted the LLM to generate works that violate copyright law but also tried to exploit said work in a way that affects the plaintiff's rights. I'm talking about things like some random person prompting a model to output a book about a wizard called Barry Potter, and publishing it somewhere. Those hypothetical cases involve model users and copyright holders, not LLMs.

> Unless it broke the law to acquire training data (the unauthorized logo is encoded in the model),

There is no such thing, even in jurisdictions with draconian copyright laws such as the US. I recommend you spend a few minutes googling for cases that were in the news already.

> LLM bot poisoning discourse is against YC site usage policy.

Sock-puppet accounts may be banned for astroturfing or slop.

You did not view the lawyer's explanation of how "likeness" liability does not require a verbatim binary copy of copyrighted/trademarked works. The famous-persons criterion was removed in the US due to users posting deep-fakes of people in salacious, illegal, and/or defamatory content.

The weak obfuscation/compaction of pirated and plagiarized content is provable in many "AI" models, and papers were posted by other YC users detailing how you may verify this yourself by intentionally outputting the original training data:

https://arxiv.org/abs/2510.15511

>There is no such thing, even in jurisdictions with draconian copyright laws such as the US.

It is actually very common to charge people engaged in piracy of IP. Also, it is a common mistake to ask a chatbot for legal advice, and ethical lawyers warn people about this rather often.

https://en.wikipedia.org/wiki/Theft_of_services

>I don't know what that's supposed to mean

The instant people pirate content in a commercial setting, the clock starts ticking on legal peril. But there are simpler explanations of what models "do" available:

'"Generative AI" is not what you think it is' (Acerola)

https://www.youtube.com/watch?v=ERiXDhLHxmo

ymmv... Best of luck =3