|
|
|
|
|
by ekez
409 days ago
|
|
I wonder if the authors have tried incorporating error feedback from Lean into their models. Work from 2023 [1] showed general purpose models did better when they were able to incorporate error feedback, humans incorporate error feedback, but none of the SOTA models on minif2f seem to. [1]: https://arxiv.org/abs/2310.04353 |
|
This is distinct from the approach of the previous SOTA for an open-weights model (Kimina Prover) which generated at the full-proof level.
While it was very impressive to see Kimina's ability to generate medium-length proofs (think AIME-level problems) without sub-goals or feedback at intermediate steps, it's likely that at least subgoal decomposition will be required for longer proofs (think IMO-level problems.)
I certainly agree that where and how error/proof state feedback is best incorporated (training data synthesis / reward function / CoT during inference / etc.) is a fascinating area of research. (It's rumored that GDM's AlphaProof does use proof state / lean feedback already.)