by philwelch 532 days ago
> I'm pretty sure if an LLM creates Paul's Boutique 2.0 in 2025 using incredible number of samples, then someone cannot sell it (or use it in a YouTube video) without first licensing those samples. I doubt very much someone could just "hide behind" an LLM and claim, "Oh, it is original, but derivative, work, created by an LLM." I doubt courts would allow that.

This isn’t how LLMs work, though. Samples are just that: literal excerpts copied verbatim from one work into another. LLMs use training data to construct a predictive model of which tokens are likely to follow which. You could probably get an LLM to reproduce samples deliberately if you wanted to, but that isn’t how they typically work.
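
To make the distinction concrete, here's a toy sketch. It's a bigram counter, which is nowhere near a real LLM (those are neural networks over far longer contexts), and the function names and the tiny corpus are made up for illustration. The point is only that what gets stored is statistics about which token follows which, not a copy of the text:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token (toy stand-in for training)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the follow-counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

# Hypothetical training text, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate".split()
model = train_bigram(corpus)
probs = next_token_probs(model, "the")
print(probs)
```

A sample, by contrast, would be the equivalent of pasting `corpus[0:3]` straight into the output.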

Regardless, at that point you’re just evaluating the claim of copyright infringement based on the nature of the work itself, which is exactly what I’m advocating, rather than presuming that all LLM output is necessarily copyright infringement if any copyrighted material was used in training.