| HN Mirror



	by _heimdall 698 days ago
	This is really interesting. I would have expected the understanding to be that humans make a guess, test it, and learn from what did or did not work. The lessons learned from prior tests would shape future guesses. Do you know if a system like the OP is learning from failed tests to guide future tests, or is it truly a brute-force search, as if it were trying to mine bitcoin?


Thorrez 698 days ago

This quote from the article sounds like it learns from failed tests:

>We trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.


_heimdall 698 days ago

Reading between the lines a bit, that does answer the question I had, though I don't think I phrased it very well.

I read that to say the model's token weights are adjusted as it goes, so in an LLM sense it is kind of learning. It isn't reasoning through an answer in the way a human does though. Meaning, the model is still just statistically predicting what an answer may be and checking if it worked.

I wouldn't chalk that up to learning at all. An AI solving complex math doesn't even seem too impressive to me with the predictive loop approach. Computers are adept at math, and throwing enough compute hardware at a problem to brute force an answer isn't surprising. I'd be really impressed if it could reliably get there with a similar number of failed attempts as a human. That could indicate that it really learned and reasoned rather than rammed through a mountain of failed guesses.


Thorrez 698 days ago

>with a similar number of failed attempts as a human

It'd be hard to know how many failed attempts the human made. Humans are constantly thinking of ideas and eliminating them quickly, possibly too fast to count.


_heimdall 698 days ago

I've never competed in math competitions at this level, but I would have expected it to be pretty clear to the human when they tested a different solution. As complex as the proofs are, is it really feasible that they are testing out a full proof in their head without realizing it?


Thorrez 697 days ago

Hmm, I think it comes down to how we define "testing" and "attempt". A human will generate many ideas and eliminate them without creating full proofs, just by seeing that the idea is going in the wrong direction.

It sounds like AlphaProof will doggedly create full proofs for each idea.

Is what the human is doing testing attempts?


sdenton4 698 days ago

Computers are good at arithmetic, not math...

There's definitely an aspect of this that is 'airplanes, not birds.' Just because the wings don't flap doesn't mean it can't fly, though.
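A minimal Lean 4 sketch of the arithmetic-versus-math distinction, offered as an illustration only and not taken from the article. The theorem name `zero_add'` is made up for the example. The first statement is a single concrete calculation that can be checked by evaluating both sides; the second is a claim about every natural number, which no finite amount of evaluation settles, so it needs an inductive argument.

```lean
-- Arithmetic: one concrete computation, checked by evaluating both sides.
example : 2 + 2 = 4 := rfl

-- Math: a statement about all natural numbers at once.
-- The proof recurses on n; the step case reuses the result for n.
theorem zero_add' : ∀ n : Nat, 0 + n = n
  | 0 => rfl
  | n + 1 => congrArg Nat.succ (zero_add' n)
```

Checking a finished proof like this is mechanical, but finding the right inductive step is the part that resembles search rather than calculation.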


_heimdall 698 days ago

That's totally fair, though wouldn't the algorithm here have to reduce the math proofs to arithmetic that can be computed in silico?
