Hacker News
by goertzen 897 days ago
No they are not.

This is a negotiation tactic by the NYT to drive up the licensing price. Period.

The Napster/music industry analogy bears no resemblance to this situation.

The only meaningful question that might be answered as a result of this is what permission and access rights crawlers have to content that is publicly and legally available.

2 comments

Surely there's a meaningful question about copying and distributing content verbatim, which GPT has been shown to do.
Not really. A model is a device capable of producing protected content given some input contortions. So is a Xerox machine.
If I Xeroxed a book and sold copies to people, I'd clearly be violating copyright. I'm not sure I follow.
Nobody has hit Xerox with an injunction against researching or building copiers just because you can use them to copy books and sell them.
Right. If a publisher found a specific Xerox machine was being used to copy and commercially distribute a book, in violation of copyright, they'd ask for an injunction against the person doing that. With OpenAI, the NY Times can see their copyrighted material on both the input (training) side and the distributed output (generated) side of a specific LLM implementation. So they cry foul on OpenAI's actions, not LLMs in general.

There appears to be an open question about whether an LLM can freely ingest copyrighted material and output it verbatim without violating copyright. That seems like an obvious "no" to me, unless we decide that LLMs get special treatment.

Also, there's the question of whether the content is being used under the terms on which it was provided on the web.

NYT is paywalled: you have to agree to a license to access it. There are exclusions in that agreement that I don't understand, but I think they may be important in this discussion!

The article does not mention Napster. Where did this reference come from?