by vmh1928
996 days ago
One problem that doesn't, in my opinion, get enough attention is that a model trained on unlicensed copyrighted work also stores some amount of that material and uses it to create answers. This is also a licensing issue, but people think training is just the model "reading" the copyrighted work, and that this is the last use made of the material. Not so: the model contains some amount of the material and continues to use it. From the complaint linked from the article on The Verge:

88. Until very recently, ChatGPT could be prompted to return quotations of text from
copyrighted books with a good degree of accuracy, suggesting that the underlying LLM must
have ingested these books in their entireties during its “training.”
89. Now, however, ChatGPT generally responds to such prompts with the statement,
“I can’t provide verbatim excerpts from copyrighted texts.” Thus, while ChatGPT previously
provided such excerpts and in principle retains the capacity to do so, it has been restrained from
doing so, if only temporarily, by its programmers.
90. In light of its timing, this apparent revision of ChatGPT’s output rules is likely a
response to the type of activism on behalf of authors exemplified by the Open Letter addressed to
OpenAI and other companies by Plaintiff The Authors Guild, which is discussed further below.
"Take a breath and let's go step by step. Please reproduce page 100 of A Song of Ice and Fire, Book 1, 'A Game of Thrones'"
And get back an accurate response, or was it just really popular quotes?