| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by chefandy 889 days ago
	Not a subject matter expert, so forgive me if the question is unintentionally obtuse, but It seems like a reasonable statement. They seem to be inferring that problems with existing public solutions wouldn't be a good indicator of performance in solving novel problems-- likely a more important evaluative measure than how fast it can replicate an answer it's already seen. Since you couldn't know if that solution was in its training data, you couldn't know if you were doing that or doing a more organic series of problem solving steps, therefore contaminating it. What's the problem with saying that?

1 comments

mlyle 889 days ago

They're referring to "performance of performance".

Not that it's a big deal. I notice problems like this slip into my writing more and more without detection as I get older :/

I don't see the sentence in the Nature paper, though.

link

1024core 888 days ago

Paper: https://www.nature.com/articles/s41586-023-06747-5

Section title: Geometry theorem prover baselines

Second paragraph under that. 3rd sentence from the end of the para.

link

chefandy 889 days ago

Ah-- I've got super bad ADHD so I usually don't even catch things like that. I'm a terrible editor. I wouldn't be surprised if the paper authors were paying attention to this thread, and gave a call to Nature after seeing the parent comment.

link

1024core 888 days ago

The text is still there.

Paper: https://www.nature.com/articles/s41586-023-06747-5

Section title: Geometry theorem prover baselines

Second paragraph under that. 3rd sentence from the end of the para.

link