by gumballindie
894 days ago
> But it was on purpose not trained on the big “web crawled” datasets to not learn how to build bombs etc, or be naughty.

It wasn't trained on web-crawled data to make it less obvious that Microsoft steals property and personal data to monetise it.

The question is: if we train a model on synthetic data generated by GPT-4, which itself has copyright issues, what is the status of that model? Will MS have to delete it as well? And all other models trained on GPT-4 data?