HN Mirror

	by visarga 937 days ago
	It was trained on "textbook quality" synthetic data plus some high-quality web data. The question is: if we train a model on synthetic data generated by GPT-4 which has copyright issues, what is the status of this model? Will MS have to delete it as well? And all models trained with GPT-4 data?

1 comment

> if we train a model on synthetic data generated by GPT-4 which has copyright issues

Is that the new directive from HQ? I see a lot of folks parroting this logic, ignoring that proceeds of crime are criminal themselves.