| HN Mirror



	by retinaros 39 days ago
	That is for sure what everyone does. Also, they train on evals, using the datasets they would be benchmarked against.

1 comments

tedsanders 39 days ago

What do you mean by this? We don’t train on evals, and if we did I’d quit on the spot.

(The loose version of this that’s true is that there may exist eval data contamination in pretraining. This is a hard problem to fully solve.)


retinaros 39 days ago

It's not that loose of a version. It's the reality, and it is probably, even surely, the focus of dedicated post-training: RL-ing on these kinds of GitHub repos. Of course you would train specifically on the task. You would mix this eval data with other data from thousands of GitHub repos.
