by thisismyswamp 920 days ago
Playing chess & go is also search in a large tree of moves leading to particular game states

But AlphaGo etc. don’t use any kind of language-based AI, so LLMs (which this thread was about) are no good for that kind of search.
The next step seems to be combining past advances in reinforcement learning with modern transformer-based models.
Which multiple teams are working on: OpenAI (Q*), and Meta just released a reinforcement learning framework.
Could you point me towards Meta's reinforcement learning framework? I'd like to see how it stacks up against OpenAI Gym.
Thank you!
Yes, the final state in chess is a single* state, which then branches out to N checkmate configurations, then N*M one-move-from-checkmate positions, and so on. (*Technically it's won/lost/draw.)

The equivalent final state in theorem proving is unique to each theorem, so such a system would need to handle an additional layer of generalization.

Is this how some chess engines work, whether advanced or not so advanced? That is, at some point the engine stops searching the forward move tree at its greatest depth and instead searches backwards from a handful of plausible checkmate states (bounded by a gross move limit), looking for an intersection with a state from a shallow forward search?
Kind of, but it's calculated offline and then just accessed during the game: https://www.chessprogramming.org/Endgame_Tablebases
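
To make the "calculated offline, then just accessed" part concrete, here is a rough sketch of the backward search on a toy game rather than chess. Everything in it is an assumption for illustration: the game (a pile of stones, each player removes 1, 2 or 3, and whoever faces an empty pile loses), the names build_tablebase and best_move, and the 20-stone limit. Real tablebases work on chess positions and are far more involved, but the idea of walking backwards from the final state is the same.

```python
from collections import deque

MOVES = (1, 2, 3)   # hypothetical toy rules: remove 1, 2 or 3 stones per turn
MAX_STONES = 20     # hypothetical size limit of the table


def successors(n):
    """Positions reachable from n by one legal move."""
    return [n - m for m in MOVES if n - m >= 0]


def predecessors(n):
    """Positions from which n can be reached by one legal move."""
    return [n + m for m in MOVES if n + m <= MAX_STONES]


def build_tablebase():
    """Offline step: search backwards from the final state.

    Returns {position: (result for the side to move, plies to the end)}.
    """
    table = {0: ("loss", 0)}  # empty pile: the side to move has lost
    # Count of moves from each position not yet shown to hand the opponent a win.
    remaining = {n: len(successors(n)) for n in range(MAX_STONES + 1)}
    queue = deque([0])
    while queue:
        pos = queue.popleft()
        result, dist = table[pos]
        for prev in predecessors(pos):
            if prev in table:
                continue
            if result == "loss":
                # The mover at prev can put the opponent into a lost position.
                table[prev] = ("win", dist + 1)
                queue.append(prev)
            else:
                remaining[prev] -= 1
                if remaining[prev] == 0:
                    # Every move from prev leads to a win for the opponent.
                    table[prev] = ("loss", dist + 1)
                    queue.append(prev)
    return table


def best_move(table, n):
    """Online step: a plain lookup, no search. Returns stones to take, or None."""
    for nxt in successors(n):
        if table[nxt][0] == "loss":
            return n - nxt
    return None  # no winning move exists from this position


table = build_tablebase()
```

During play the engine only calls something like best_move(table, n); all the searching happened when the table was built. The first layer found backwards from the final state is the analogue of the N checkmate configurations, the next layer is the one-move-from-checkmate positions, and so on.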