

by geocar 1238 days ago

> ChatGPT/LLMs can essentially crawl _anything_ they want, regardless of legality, license, consent, etc. These models are trained on anything that can be ingested. Once trained, you can release the model with plausible deniability

We will see. The idea that ML models hold only a creative essence, and generate from something that cannot be copyrighted, has not been tested in court.

I personally am not convinced: my own experiments with prompt-stuffing GPT definitely seem to reveal parts of the training corpus.

I am reminded of a story about how billg would type a command into BASIC computers at trade shows to "reveal" that they contained Microsoft-copyrighted code (gotcha!).

I imagine if someone did that in front of a judge, it would be game over.