by HarHarVeryFunny
654 days ago
An LLM will learn what it CAN (and needs to, to reduce the loss), but not what it CAN'T. How difficult is that to understand?!
Tracking probable board state given a sequence of moves (which don't even need to go all the way back to the start of the game!) is relatively simple to do, and doesn't require hundreds of sequential steps that are beyond the architecture of the model. It's just a matter of incrementally updating the current board state "hypothesis" with each new move (essentially: "a knight just moved to square X, so it must have moved away from some square a knight's move away from X that we believe currently contains a knight").
Ditto for estimating a player's Elo rating in order to predict appropriately good or bad moves. It's basically just a matter of how often the player makes the same move as other players of a given Elo rating in the training data. Again, no need for hundreds of steps of sequential computation that are beyond the architecture of the model.
Doing an N-ply lookahead to reason about potential moves is a different story, but you want to ignore that and instead throw out a straw man "counter argument" about maintaining board state, as if that somehow proves that the LLM can magically apply more than N (= number of layers) steps of sequential reasoning to derive moves.
Sorry, but this is precisely magical faith-based thinking: "it can do X, so it can do Y", without any analysis of what it takes to do X and Y, and why one is possible and the other is not.
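To make the knight example concrete, here is a rough sketch of that one-step update. Everything in it is an assumption for illustration: the board "hypothesis" is a plain dict from square name to piece letter, and the helper names (`knight_origins`, `apply_knight_move`) are made up. It says nothing about what a transformer actually computes internally; it only shows that the update is a constant amount of work per move.

```python
# Hypothetical sketch: update a board-state hypothesis for one knight move.
# The dict representation and all function names are illustrative assumptions.

# The eight (file, rank) offsets a knight can jump.
KNIGHT_OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2),
                  (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def parse(square):
    """Convert a square name like 'f3' to zero-based (file, rank)."""
    return ord(square[0]) - ord("a"), int(square[1]) - 1


def name(file, rank):
    """Convert zero-based (file, rank) back to a square name like 'f3'."""
    return "abcdefgh"[file] + str(rank + 1)


def knight_origins(board, target, knight):
    """Squares a knight's move away from target that we believe hold `knight`."""
    f, r = parse(target)
    origins = []
    for df, dr in KNIGHT_OFFSETS:
        nf, nr = f + df, r + dr
        if 0 <= nf < 8 and 0 <= nr < 8 and board.get(name(nf, nr)) == knight:
            origins.append(name(nf, nr))
    return origins


def apply_knight_move(board, target, knight):
    """Return an updated hypothesis after `knight` lands on `target`.

    If the origin is ambiguous or unknown, the hypothesis is returned
    unchanged; a real tracker would need extra information to disambiguate.
    """
    origins = knight_origins(board, target, knight)
    if len(origins) != 1:
        return board
    updated = dict(board)
    del updated[origins[0]]      # the knight left its old square
    updated[target] = knight     # and now sits on the target square
    return updated


board = {"g1": "N", "b1": "N", "e2": "P"}
board = apply_knight_move(board, "f3", "N")
```

In this toy position only the g1 knight can reach f3, so the hypothesis moves it there and leaves everything else alone. The work per move is bounded by the eight offsets, regardless of how long the game has been going.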
Right, and the point is that you don't know what it CAN'T learn. You clearly don't quite understand this, because you say stuff like this:
>Chess is a good example, since it's easy to understand. The generative process for world class chess (whether human, or for an engine) involves way more DEPTH (cf layers) of computation than the transformer has available to model it
and it's just baffling because:
1. Humans don't play chess anything like chess engines. They literally can't, because the brain has iterative computation limits well below those of a computer. Most Grandmasters are only evaluating 5 to 6 moves deep on average.
2. We have a chess transformer playing world class chess (grandmaster level) - https://arxiv.org/abs/2402.04494.
You keep trying to make the point that because a Transformer architecturally has a depth limit for any given trained model, it cannot reach human level.
But this is nonsensical for various reasons.
- Nobody is stopping you from just increasing N such that every GI problem we care about is covered.
- You have shown literally no evidence that the N even state-of-the-art models possess today is insufficient to match human iterative compute ability.
GPT-4o does arbitrary arithmetic in one shot more accurately than any human brain, and that's supposedly something it's bad at. You can clearly see it can reach world class chess play.
If you have some iterative computation benchmark that shows transformers zero-shotting worse than an unaided human, then feel free to show me.