by zeknife
723 days ago
Talking about sentence structure in the conventional sense may not be meaningful here, since what could be described as reasoning in LLMs happens in a more abstract space. If we're looking to understand why a small change makes a big difference, it's pretty intuitive to consider that the second instance of "her" is modified by "mother" through attention and ends up as a wildly different vector. Regardless, it's reasonable to assume that certain aspects of the prompt or input structure will prime the model to be more scrutinizing. I'd be surprised to see it point out a logical inconsistency like this if it were just part of a broader context and the model weren't asked "what it thinks" or to "be logical".
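To make the attention point concrete, here is a rough toy sketch, not how any real model is configured. The embeddings, the example token sequences, and the identity Q/K/V projections are all made up for illustration, and the attention is unmasked and single-head. It only shows that the same input token ("her") comes out as a different vector once "mother" is in the context.

```python
import math

# Hypothetical 3-dimensional embeddings, invented for this illustration.
EMB = {
    "she": [1.0, 0.0, 0.0],
    "called": [0.0, 0.0, 1.0],
    "her": [0.8, 0.2, 0.0],
    "mother": [0.0, 1.0, 0.5],
}


def softmax(xs):
    # Subtract the max for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head, unmasked self-attention with identity Q/K/V projections
    (an assumption made to keep the sketch short)."""
    vecs = [EMB[t] for t in tokens]
    d = len(vecs[0])
    out = []
    for q in vecs:
        # Scaled dot-product score of this token against every token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vecs]
        weights = softmax(scores)
        # Output is the attention-weighted mix of all token vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, vecs)) for i in range(d)])
    return out


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same token "her" at index 2, with and without "mother" in the context.
her_alone = self_attention(["she", "called", "her"])[2]
her_with_mother = self_attention(["she", "called", "her", "mother"])[2]

print("her (no mother):  ", her_alone)
print("her (with mother):", her_with_mother)
print("cosine similarity:", cosine(her_alone, her_with_mother))
```

In a real model the projections are learned and there are many heads and layers, so the shift can be far larger than in this toy; the sketch only shows the mechanism.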
|
I would guess that the human mind does this abstraction invisibly behind the scenes, which screws up our intuition when we analyze how LLMs work. I wonder if examples that run counter to human intuition might offer insight, because in their post-hoc rationalization of the "logic" they believe their mind executed to produce an answer, humans reveal that what they perceive as logical thinking is really heuristics.
(I don't think I articulated what I'm thinking here very well...or, perhaps I have fallen victim to my very own theory!)
A bit more effort... the text is converted not only into tokens but also into abstract tokens, and it is because of that translation that the model is able to match it to training data (which would also have to be translated into abstract tokens). How it resolves the inconsistency after that translation is beyond me, but it wouldn't surprise me if this is (in this case) a rather trivial problem for someone with depth in logic or some other related discipline.