by jsemrau
517 days ago
I have been doing some experiments with agents and reinforcement learning, playing a 4x4 Tic Tac Toe game [1]. Based on my analysis of the models' "thought" process, we are still really far from true understanding of such games. In my game as well as OP's, the rules are pre-trained and the models are good enough to reach a conclusion (which in itself is already impressive), but there is still a long way to go.

[1] https://jdsemrau.substack.com/p/nemotron-vs-qwen-game-theory...