HN Mirror



	by huac 481 days ago
	that comment refers to test-time inference, i.e. what the model is prompted with, not what it is trained on. this is, of course, also a tricky problem (esp. over long context, needle in a haystack), but it should be much easier than memorization. anyway, another interpretation is that the model also needs to decide whether the code in the issue is a reliable fix or not

1 comments

sebzim4500 480 days ago

Then I don't understand what he's suggesting. It is obviously not the case that 1/3 of the questions in the SWE-bench dataset have the solution as part of the issue that is provided to the model. You can just download it and look. The solution is likely in the training data, though.
