by solarhexes 848 days ago
If by “understand” you mean “can model reasonably accurately much of the time” then maybe you’ll find consensus. But that’s not a universal definition of “understand”.

For example, if I asked you whether you “understand” ballistic flight, and you produced a table that you interpolate from instead of a quadratic, then I would not feel that you understand it, even though you can kinda sorta model it.

And even if you do, if you didn’t produce the universal gravitation formula, I would still wonder how “deeply” you understand. So it’s not like “understand” is a binary I suppose.
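
To make the table-versus-quadratic contrast concrete, here is a minimal sketch, not anything from the original comment. It assumes flat ground, no drag, and constant gravity; the names `range_closed_form`, `build_table`, and `range_from_table` are hypothetical and chosen only for illustration. The closed form comes from the quadratic trajectory, while the table can only interpolate between rows it already has.

```python
import math

G = 9.81  # m/s^2; assumed constant gravity, no air resistance


def range_closed_form(speed, angle_deg):
    """Flat-ground range from the quadratic trajectory: R = v^2 * sin(2*theta) / g."""
    return speed ** 2 * math.sin(2 * math.radians(angle_deg)) / G


def build_table(speed, angles=(0, 15, 30, 45, 60, 75, 90)):
    """A coarse 'firing table' for one muzzle speed: (angle, range) rows."""
    return [(a, range_closed_form(speed, a)) for a in angles]


def range_from_table(table, angle_deg):
    """Linear interpolation between the two neighboring table rows."""
    for (a0, r0), (a1, r1) in zip(table, table[1:]):
        if a0 <= angle_deg <= a1:
            t = (angle_deg - a0) / (a1 - a0)
            return r0 + t * (r1 - r0)
    raise ValueError("angle outside the table")


table = build_table(speed=100.0)

# On a table row the two agree; between rows the interpolation drifts.
print(range_from_table(table, 45), range_closed_form(100.0, 45))
print(range_from_table(table, 40), range_closed_form(100.0, 40))
```

The table gives usable answers near its rows, but it says nothing about a different muzzle speed or a different value of `G`, which is roughly the gap between "can model it" and "has the formula" that the comment points at.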


Well what would you need to see to prove understanding? That's the metric here. Both the LLM and the human brain are black boxes. But we claim the human brain understands things while the LLM does not.

Thus what output would you expect for either of these boxes to demonstrate true understanding to your question?

It is interesting that you are demanding a metric here, as yours appears to be like duck typing: in effect, if it quacks like a human...

Defining "understanding" is difficult (epistemology struggles with the apparently simpler task of defining knowledge), but if I saw a dialogue between two LLMs figuring out something about the external world that they did not initially have much to say about, I would find that pretty convincing.

> Without a metric no position can be made. All conversation about this topic is just conjecture with no path to a conclusion.

This is a common misunderstanding, one also seen with regard to definitions. When applied to knowledge acquisition, it suffers from a fairly obvious bootstrapping problem, which goes away when you realize that metrics and definitions are rewritten and refined as our knowledge increases. Just look at what has happened to concepts of matter and energy over the last century or so.

You are free to disagree with this, but I feel your metric for understanding resembles the Turing test, while the sort of thing I have proposed here, which involves AIs interacting with each other, is a refinement that makes a step away from defining understanding and intelligence as being just whatever human judges recognize as such (it still depends on human judgement, but I think one could analyze the sort of dialogue I am envisioning more objectively than in a Turing test.)

No, it's not a misunderstanding. Without a concrete definition of a metric, comparisons are impossible because everything is based on wishy-washy conjectures on vague and fuzzy concepts. Hard metrics bring in quantitative data. They show hard differences.

Even if the metric is some side marker that is later found to have poor correlation or no causal link with the thing being measured, the hard metric is still valid.

Take IQ. We assume IQ measures intelligence. But in the future we may determine that no, it doesn't measure intelligence well. That doesn't change the fact that IQ tests still measured something. The score still says something definitive.

My test is similar to the Turing test. But so is yours. In the end there's a human in the loop making a judgment call.

This is rather self-contradictory: you insist we can't make progress with wishy-washy conjectures on vague and fuzzy concepts, and yet your entire argument in this thread for your claim that machine understanding of the real world has been achieved is based on exactly that: your personal subjective assessment of LLM performance!

In your final paragraph, you attempt to suggest that my proposed test is no better than the Turing test (and therefore no better than what you are doing), but as you have not addressed the ways in which my proposal differs from the Turing test, I regard this as merely waffling on the issue. In practice, it is not so easy to come up with tests for whether a human understands an issue (as opposed to having merely committed a bunch of related propositions to memory) and I am trying to capture the ways in which we can make that call.

You entered this debate saying "I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality", yet your post here ends with "in the end there's a human in the loop making a judgment call", explicitly acknowledging that your strong initial claims are matters of opinion, rather than established facts supported by hard metrics.

Are you telling me that WW1 artillery crews didn't understand ballistics? Because they were using tables.

There's no difference between doing something that works without understanding and doing the exact same thing with understanding.

You’ve decided that your definition of “understanding” is correct. Ok.
The author of the post to which you are replying seems to be defining "understanding" as merely meaning "able to do something."
The author of the post is saying that understanding something can't be defined because we can't even know how the human brain works. It is a black box.

The author is saying that at best you can only set benchmark comparisons. We just assume all humans have the capability of understanding without even really defining the meaning of understanding. And if a machine can mimic human behavior, then it must also understand.

That is literally how far we can go from a logical standpoint. It's the furthest we can go in terms of classifying things as either capable of understanding or not capable or close.

What you're not seeing is the LLM is not only mimicking human output to a high degree. It can even produce output that is superior to what humans can produce.

What the author of the post actually said - and I am quoting, to make it clear that I'm not putting my spin on someone else's opinion - was "There's no difference between doing something that works without understanding and doing the exact same thing with understanding."
I'm the author. To be clear. I referred to myself as "the author."

And no, I did not say that. Let me be clear: I did not say that there is "no difference". I said that whether or not there is a difference, we can't fully know, because we can't define or know what "understanding" is. At best we can only observe external reactions to input.

I think there are two axes: reason about and intuit. I "understand" ballistic flight when I can calculate a solution that puts an artillery round on target. I also "understand" ballistic flight when I make a free throw with a basketball.

On writing that, I have an instinct to revise it to move the locus of understanding in the first example to the people who calculated the ballistic tables, based on physics first-principles. That would be more accurate, but my mistake highlights something interesting: an artillery officer / spotter simultaneously uses both. Is theirs a "deeper" / "truer" understanding? I don't think it is. I don't know what I think that means, for humans or AI.