|
|
|
|
|
by _heimdall
698 days ago
|
|
This is really interesting. I would have expected the understanding to be that humans make a guess, test it, and learn from what did or did not work. The lessons learned from the prior tests would impact future guesses. Do you know if a system like the OP is learning from failed tests to guide future tests, or is it a truly a brute force search as if it were trying to mine bitcoin? |
|
>We trained AlphaProof for the IMO by proving or disproving millions of problems, covering a wide range of difficulties and mathematical topic areas over a period of weeks leading up to the competition. The training loop was also applied during the contest, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.