by ghaff
545 days ago
I'm pretty sure any competent lawyer would stipulate that, in many or most cases, training is happening on copyrighted information. I'm also pretty sure that OpenAI is not arguing that all of its training data is either licensed or under copyrights it owns. (Some companies, perhaps Adobe, have been more conservative.) Perhaps I'm wrong, but I haven't heard that argument made publicly and I would need to be convinced.
|
Training on CNN and Netflix content = i sleep
Training on private personal and corporate inboxes, medical records, and illegal content, purchased from blackhat data brokers = real shit
A Kenyan data labeler famously cut ties with OpenAI after OpenAI asked them to gather CSAM.