Hacker News new | ask | show | jobs
by chefandy 889 days ago
Not a subject matter expert, so forgive me if the question is unintentionally obtuse, but It seems like a reasonable statement. They seem to be inferring that problems with existing public solutions wouldn't be a good indicator of performance in solving novel problems-- likely a more important evaluative measure than how fast it can replicate an answer it's already seen. Since you couldn't know if that solution was in its training data, you couldn't know if you were doing that or doing a more organic series of problem solving steps, therefore contaminating it. What's the problem with saying that?
1 comments

They're referring to "performance of performance".

Not that it's a big deal. I notice problems like this slip into my writing more and more without detection as I get older :/

I don't see the sentence in the Nature paper, though.

Paper: https://www.nature.com/articles/s41586-023-06747-5

Section title: Geometry theorem prover baselines

Second paragraph under that. 3rd sentence from the end of the para.

Ah-- I've got super bad ADHD so I usually don't even catch things like that. I'm a terrible editor. I wouldn't be surprised if the paper authors were paying attention to this thread, and gave a call to Nature after seeing the parent comment.
The text is still there.

Paper: https://www.nature.com/articles/s41586-023-06747-5

Section title: Geometry theorem prover baselines

Second paragraph under that. 3rd sentence from the end of the para.