by a_wild_dandan 1072 days ago
I suspect the opposite outcome is also plausible: the LLM is viewed as analogous to a blog author. The blogger/LLM may consume a book, subsequently produce "derived" output (generated text), and thus generate revenue for the blogger/LLM's employer. On this view, the blogger/LLM's output -- while "derived" in some sense -- differs enough to be considered original work, rather than "derivative work" (like a book's film adaptation). Auditing how the blogger/LLM consumed the relevant material is thus absurd.

Of course, this line of reasoning hinges on the legitimacy of an "LLM agent <-> blogger agent" type of analogy. I suspect the equivalence will become more natural as these AI agents continue to rapidly gain human-like qualities. How acceptable that perspective would be now, I have no idea.

In contrast, if the output of a blogger is legally distinct from an AI's, the consequences quickly become painful.

* A contract agency hires Anne to practice play recitals verbally with a client. Does the agency/Anne owe royalties for the material they choose? What if the agency was duped, and Anne used -- or was -- a private AI which did everything?

* How does a court determine if a black box AI contains royalty-requiring training material? Even if the primary sources of an AI's training were recorded and kosher, a sufficiently large collection of small quotes could be reconstructed into an author's story.

* What about AIs which inherit weights, or generated training data, from other AIs of unknown training provenance? Or which were earlier trained on some materials with licenses that later changed? Or AIs that recursively trained their successors using copyrighted works which they reconstructed from legal sources? When do AIs become infected with illegal data?

The business of regulating learning differently depending on whether the agent uses neurons or transistors seems... fraught. Perhaps there's a robust solution for policing knowledge w.r.t. silicon agents. If you have an idea, please share!