Hacker News
by agentultra 1108 days ago
> Many undergraduates also confidently regurgitate incorrect proofs of linear algebra theorems, do you consider them completely lacking in reasoning ability?

No. Because I can ask them questions about their proof, they understand what it means, and can correct it on their own.

I've seen LLMs correct their answers after receiving prompts that point out the errors in prior outputs. However, I've also seen them give more wrong answers. That tells me they don't "understand" what it means for an expression to be true or how to derive expressions.

For that we'd need some form of deductive reasoning, not generating the next likely token based on a model trained on some input corpus. Next-token generation is not how most mathematicians seem to do their work.

However, I think it's plausible that we will have a machine learning algorithm that can do simple inductive proofs, and that will be nice. As for the original article, it seems like they're taking a first step toward this.

In the meantime, why should anyone believe that an LLM is capable of deductive reasoning? Is a tensor enough to represent semantics, so that I can dispatch a theorem to an LLM and have it write a proof? Or do I need to train it on enough proofs first before it can start inferring proof-like text?

1 comment

I suspect you have adopted the speech patterns of people you respect, criticizing LLMs for lacking “reasoning” and “understanding” capabilities, without thinking about it carefully yourself.

1. How would you define these concepts so that incontrovertible evidence is even possible? Is “reasoning” or “understanding” even possible to measure? Or are we just inferring, from certain proxy signals, that an underlying understanding exists?

2. Is it an existence proof? I.e., we have shown one domain where it can reason, therefore reasoning is possible. Or do we have to show that it can reason in all domains that humans can reason in?

3. If you posit that it’s a qualitative evaluation akin to the Turing test, specify something concrete here and we can talk once that’s solved too.