| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by dash2 692 days ago
	AlphaZero is more general than a Go and Chess AI, right? Isn't it a general self-play algorithm?

2 comments

sapiogram 692 days ago

Only slightly more general. It only works for games that are zero-sum, deterministic, have no hidden information, and discrete game state and moves. Other examples include connect-4.

link

cubefox 692 days ago

So finding Lean proofs can be conceptualized as a zero-sum game?

Another basic requirement is that valid moves / inference steps and the winning condition can be efficiently verified using some non-AI algorithm. Otherwise there would not be a reward signal for the reinforcement learning algorithm. This is different from answering most natural language questions, where the answer can't be checked trivially.

link

fmap 692 days ago

Theorem proving can be formulated as a game, see e.g., https://plato.stanford.edu/entries/logic-games/ and interactive theorem provers can verify that a proof is correct (and related sub problems, such as that a lemma application is valid).

Conceptually, if you're trying to show a conjunction, then it's the other player's turn and they ask you for a proof of a particular case. If you're trying to show a disjunction then it's your turn and you're picking a case. "Forall" is a potentially infinite conjunction, "exists" is a potentially infinite disjunction.

In classical logic this collapses somewhat, but the point is that this is still a search problem of the same kind. If you want to feel this for yourself, try out some proofs in lean or coq. :)

link

sapiogram 692 days ago

I don't think AlphaZero is related to this work, apart from both being NN-based. AlphaZero and its training pipeline fundamentally only works for "chess-like" two-player games, where the agent can play against itself and slowly improve through MCTS.

link

adroniser 692 days ago

"AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go."

link

redman25 692 days ago

I think that it can only play “games” for which it has perfect information about the state of the “game”.

link