Yeah, but the Cheryl's birthday problem doesn't have any ambiguity like that. It's all in very simple language; the only complexity is keeping track of states of mind, which is easy to abstract away from the language.
That is exactly the point I was making in my comment above. This type of unambiguous problem is best solved using formal languages, something closer to quantitative reasoning. But tools like Prolog or classical automated reasoning approaches are quite brittle: they break down quickly once you start to introduce ambiguity and noise. Statistical approaches like hidden Markov models, which people used in those cases, were a precursor to the LLMs we have today.
But I was going down a rabbit hole there. My main point is that trying to use LLMs to solve logic puzzles that can easily be solved in Prolog is a waste of time and a failure of the imagination. The applications that should be explored, and that would be most fruitful, are those where there is ambiguity and contradiction.
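To make "easily solved" concrete, here is a rough sketch of Cheryl's birthday as plain filtering over candidate dates. It's in Python rather than Prolog just for readability. The date list is the one from the usual statement of the puzzle (an assumption, since nobody quoted it in this thread), and the helper names `unique_by` and `solve` are made up for illustration.

```python
from collections import Counter

# Candidate dates from the commonly circulated version of the puzzle
# (assumed here; not quoted anywhere in this thread).
DATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]


def month(date):
    return date[0]


def day(date):
    return date[1]


def unique_by(candidates, key):
    """Keep only the candidates that are identified uniquely by key(candidate)."""
    counts = Counter(key(c) for c in candidates)
    return [c for c in candidates if counts[key(c)] == 1]


def solve(dates):
    # Statement 1: Albert (told the month) knows that Bernard (told the day)
    # cannot know the date. So Albert's month contains no date whose day is
    # unique in the full list. (Albert himself not knowing is automatic here,
    # because every month has more than one candidate.)
    revealing_months = {month(d) for d in unique_by(dates, day)}
    after_albert = [d for d in dates if month(d) not in revealing_months]

    # Statement 2: Bernard now knows, so the day is unique among what is left.
    after_bernard = unique_by(after_albert, day)

    # Statement 3: Albert now knows too, so the month is unique among what is left.
    return unique_by(after_bernard, month)


if __name__ == "__main__":
    print(solve(DATES))  # [('July', 16)]
```

Each "state of mind" is just a filter on the set of still-possible dates, which is why the language part drops away so easily. Change one sentence of the puzzle to something vague and this approach has nothing to filter on, which is the brittleness I mean.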