by nextworddev 624 days ago
The majority of humans in flesh can't solve the problem - so we need alternate measures for judging theory of mind capabilities in LLMs
4 comments

What about the difference that the human knows what they don't know?

In contrast, the LLM knows nothing, but confidently half-regurgitates correlational text that it has seen before.

As far as all the research on this goes, LLMs (internally) mostly know what they know, but incentivizing that information to make it into the output is difficult.

Well, the majority of trained neural network models also fail to solve most problems they're tested on. When you pick up a neural net paper (or really any machine learning paper) and look at the experimental results, the authors will invariably list the results of their best-performing model, chosen by careful cross-validation over multiple random seeds.

The point of this testing is to estimate the true learning ability of a system by looking at what it can do in the best-case scenario. Nobody reports the worst-case scenario, otherwise deep neural nets would look a lot less impressive than they do.
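
To make the best-case versus worst-case gap concrete, here is a toy sketch. It is not taken from any paper: `simulate_seed_scores` is a hypothetical stand-in that draws noisy scores around an assumed "true" accuracy instead of training real models, and the 0.70 accuracy and 0.03 noise values are made-up illustration numbers. It only shows how far the best seed can sit from the mean and the worst seed when all three are computed from the same runs.

```python
import random
import statistics


def simulate_seed_scores(n_seeds, true_accuracy=0.70, noise=0.03, rng_seed=0):
    """Hypothetical stand-in for training one model per random seed.

    Each "run" is the assumed true accuracy plus Gaussian noise,
    clipped to [0, 1]. No real model is trained here.
    """
    rng = random.Random(rng_seed)
    return [
        min(1.0, max(0.0, rng.gauss(true_accuracy, noise)))
        for _ in range(n_seeds)
    ]


def summarize(scores):
    """Return the best, mean, and worst score plus the sample spread."""
    return {
        "best": max(scores),
        "mean": statistics.mean(scores),
        "worst": min(scores),
        "stdev": statistics.stdev(scores),
    }


scores = simulate_seed_scores(n_seeds=10)
summary = summarize(scores)
for name, value in summary.items():
    print(f"{name:>5}: {value:.3f}")
```

Reporting only the `best` row is the practice described above; printing the `mean`, `worst`, and `stdev` rows alongside it is what a worst-case-aware report would add.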

This doesn't measure theory of mind at all; it's just a silly logic puzzle. What we need are AI researchers who have read a psychology book and understand what theory of mind experiments are actually trying to demonstrate.

  > The majority of humans in flesh can't solve the problem
This is a non sequitur.

All humans have the capacity to reason, but not all humans reason. There's a big difference. Hell, even GPT-4o will tell you this.

Here, don't trust me, let's ask GPT about memorization and reasoning:

  Prompt: 
  Suppose a LLM is able to get the right answer to a question, but when asked about the steps to arrive at the answer they are not able to provide an accurate solution. Does this tell us anything about a model's ability to reason or its ability to memorize?  What about when it gets the right answer but the explanation would lead to a different answer than the one stated?

  Response:
  These scenarios highlight important aspects of a model's capabilities and limitations regarding reasoning and memorization.  Let's break them down:


  1. **LLM Gets the Right Answer but Can't Explain the Steps:**
    - **Memorization vs. Reasoning:** If a model provides a correct answer without accurately explaining the steps, it might indicate that the model is relying more on memorization rather than understanding or reasoning. The model could have seen similar questions in its training data and learned to associate certain questions with specific answers without truly understanding the underlying principles.
    - **Reasoning Limitations:** This situation suggests that the model may not have a robust ability to reason through the problem-solving process. It highlights a gap between recognizing patterns in data and applying logical reasoning to derive answers.

  2. **LLM Provides Correct Answer but Incorrect Explanation:**
    - **Memorization of Answers:** This might suggest that the model has memorized the answer from similar examples in the training data but does not understand the reasoning process. It could be retrieving an answer that it "knows" is correct without understanding why.
    - **Inconsistent Reasoning Ability:** Giving a correct answer with an explanation that would lead to a different answer indicates inconsistencies in its reasoning ability. It may reflect that the model's internal heuristics for generating explanations are not aligned with the logic used to derive answers.

  In both cases, these issues highlight the challenges in AI related to understanding vs. pattern recognition. While LLMs are excellent at recognizing patterns and retrieving information, their ability to reason logically and consistently can be limited. This differentiation is crucial when evaluating the capabilities of AI models, particularly in contexts where understanding and accurate reasoning are essential.