| LLMs are trained to sound confident. But they can also only do negation through exhaustion, known unknowns, future unknowns, etc. That is the pain of the Entscheidungsproblem. Even Presburger arithmetic (the natural numbers with addition and equality), which is decidable, still has a doubly exponential worst-case lower bound for its decision procedure. That is worse than factorial time, for those who've not dealt with it. Add in multiplication and you are undecidable. Even if you decided to use the DAG-like structure of transformers, causality is very, very hard. https://arxiv.org/abs/1412.3076 LLMs only have cheap access to their model probabilities, which aren't ground truth. So while a request for a pizza recipe could be called out as a potential joke, through exhaustion, if you add a topping that wasn't in its training set, an LLM can't know when it is wrong in the general case. That was an intentional choice with statistical learning, and it is why it was called PAC (probably approximately correct) learning. That was actually a cause of a great rift with the Symbolic camp in the past. PAC learning is practically computable in far more cases, and even the people who work in automated theorem proving don't try to prove no-instances in the general case. There are lots of useful things we can do in BPP (bounded-error probabilistic polynomial time) and with random walks. But unless there are major advancements in math and logic, transformers will have limits. |
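
For a sense of scale on the Presburger claim: the known worst-case lower bound (Fischer and Rabin) has the doubly exponential form 2^(2^(cn)) for some constant c > 0, where n is the formula length. Here is a rough sketch comparing that growth with factorial time. The function names are made up for illustration, and c = 1 is purely an assumption to keep the numbers printable; the real constant is different.

```python
import math


def factorial_steps(n: int) -> int:
    """Step count of a hypothetical factorial-time procedure: n!."""
    return math.factorial(n)


def doubly_exponential_steps(n: int, c: int = 1) -> int:
    """Step count of the form 2^(2^(c*n)).

    c = 1 is an illustrative assumption, not the constant from the
    actual lower-bound proof.
    """
    return 2 ** (2 ** (c * n))


def digits(x: int) -> int:
    """Number of decimal digits in a positive integer."""
    return len(str(x))


if __name__ == "__main__":
    # Compare the size (in decimal digits) of both step counts.
    print("n  digits(n!)  digits(2^(2^n))")
    for n in (2, 4, 6, 8, 10):
        print(n, digits(factorial_steps(n)), digits(doubly_exponential_steps(n)))
```

With these assumptions, n = 10 already gives a 7-digit count for factorial time against a 309-digit count for the doubly exponential one, so "worse than factorial" is an understatement.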