| HN Mirror



	by energy123 511 days ago
	If OpenAI wanted the questions/solutions, there is going to be a reason for that. This data is not sitting in an unopened folder on Sam's computer. There are a lot of ways you can use data to improve a model without directly training on it. A train/test validation loop, for example. Or as a wellspring for synthetic data generation. But all of these ways involve some level of data contamination; that's unavoidable.
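To make the validation-loop point concrete, here is a toy simulation. It is a sketch under stated assumptions, not a description of anything OpenAI actually did: every candidate model has the same true accuracy, and the only thing that differs is noise on a fixed benchmark. The names `score` and `select_on_benchmark` and all the numbers are hypothetical. Picking the best candidate by benchmark score still leaks the benchmark into the result, even though nothing was trained on it.

```python
import random


def score(true_acc, n_questions, rng):
    """Fraction of questions answered correctly by a model with a fixed true accuracy."""
    return sum(rng.random() < true_acc for _ in range(n_questions)) / n_questions


def select_on_benchmark(n_candidates=200, n_questions=100, true_acc=0.5, seed=0):
    """Pick the best of many equally good candidates using one fixed benchmark."""
    rng = random.Random(seed)
    # Every candidate is equally good; only the benchmark noise differs.
    benchmark_scores = [score(true_acc, n_questions, rng) for _ in range(n_candidates)]
    best = max(benchmark_scores)
    # Fresh questions the selection loop never saw. The selected candidate's
    # true accuracy is still true_acc, so it is scored the same way.
    fresh = score(true_acc, 10_000, rng)
    return best, fresh


best, fresh = select_on_benchmark()
print(f"benchmark score of selected candidate: {best:.2f}")
print(f"same candidate on fresh questions:     {fresh:.2f}")
```

The gap between the two printed numbers is pure selection effect: the benchmark was only ever used for validation, and the reported score is inflated anyway.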