| HN Mirror

	by jsemrau 517 days ago
	I have been doing some experiments with agents and reinforcement learning playing a 4x4 Tic Tac Toe game [1]. Based on my analysis of the models' "thought" process, we are still really far from true understanding of such games. While in my game, as well as OP's, the rules are pre-trained and the models are good enough to reach a conclusion (which in itself is already impressive), there is still a long way to go. [1] https://jdsemrau.substack.com/p/nemotron-vs-qwen-game-theory...