by 6gvONxR4sf7o 1774 days ago
> The "language models don't really understand anything" corner is getting smaller and smaller.

In my mind, understanding a thing means you can justify an answer. Like a student showing their work and being able to defend it. An answer with a proof understands the answer with respect to the proof it provides. E.g. to understand an answer with regards to first order logic, it'll have to be able to defend a logical deduction of that answer.

These models still can't justify their answers very well, so I'd say they're accurate but only understand with respect to a fairly dumb proof system (e.g. they can select relevant passages or just appeal to overall accuracy statistics). They're still far from being able to justify answers in the various ways we do, which I'd say means that, by definition, they still don't understand with respect to the "proof systems" that we understand things with.

Maybe the next step will require increasingly interesting justification systems.

7 comments

> In my mind, understanding a thing means you can justify an answer.

Do you understand cats? If I show you a picture of either a cat or a dog, do you think you can tell which one it is? I think most people could solve that challenge, and if pressed they could wax poetic about what makes them think it is a cat. Maybe they would mention the shape of an ear, or talk about feline grace or what have you. But is that really a “justification”? Let alone one they can “defend”? How would “defending” even work in this situation?

You could probably teach an AI to post-hoc rationalize their decisions, the same way people do.

You absolutely could, and it could even end up just as accurate as human post-hoc rationalization. ;)

Self-analysis and self-interpretation is pretty clearly a key part of consciousness... I do wonder - how important to the process is the actual fidelity of the interpretation? Those people you meet who think they have deep insight into their own psyche while clearly having no clue... maybe they're p-zombies. ;)

That’s basically the gist of explainable AI.

The point I’m trying to make (poorly) is that I don’t think a one-size-fits-all definition of “understanding” is useful. It’s more useful to define understanding with respect to how you can justify a thing you know.

So for the case of cats, I will understand cats at a different level from a cat biologist. I can point to features that seem catlike, and they can talk about all sorts of other scientific things that make a cat a cat.

With respect to that sciency kind of understanding, I don’t understand cats. With respect to a much looser ‘point at the features’ kind of understanding, I do understand cats.

Take entomologists, bird watchers or those who identify mushrooms. In each field, there are instances that are fiendishly difficult to tell apart. If you ask an expert for advice, they'll tell you what features to look for and where, sometimes not even on the item itself, and some requiring specialist tools.

While explanations are far from sufficient to instantly close the gap to expertise, they provide a massive boost that you might not otherwise have found on your own. The justification comes from the fact that their explanations are a reliable source of increased performance in making fine-grained distinctions. It's further demonstrated by answers to questions like "If they are so difficult to tell apart, why make these distinctions?" or "How did they come to be so similar?".

> In my mind, understanding a thing means you can justify an answer.

What if the language model can generate a step-by-step explanation in the form of text? [0]

There's no guarantee that the reasoning was used to come up with the answer in the first place, and no proof that the reasoning isn't just the product of "a really fancy Markov chain generator", but would you accept it?

We're really walking into Searle's Chinese Room at this point.

[0] https://nitter.hu/kleptid/status/1284069270603866113#m

Umm, no, there are clear verification methods for Explainable AI techniques today. One way to check a justification is to remove, in some sense, the things it marks as important and see whether the output changes significantly. Sort of like a sensitivity test for justification.
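
To make the sensitivity test concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the "model" is a toy bag-of-words scorer with made-up weights, and `explain` simply returns the highest-weighted tokens. A real check would swap in the actual model and the actual explanation method.

```python
# Toy stand-in for a model: a linear bag-of-words scorer (hypothetical weights).
WEIGHTS = {"great": 2.0, "love": 1.5, "boring": -2.0, "slow": -1.0}

def score(tokens):
    """Model output: sum of per-token weights (unknown tokens count as 0)."""
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)

def explain(tokens, k=1):
    """Claimed justification: the k tokens with the largest absolute weight."""
    ranked = sorted(set(tokens), key=lambda t: abs(WEIGHTS.get(t, 0.0)), reverse=True)
    return ranked[:k]

def ablation_shift(tokens, removed):
    """How far the output moves when the given tokens are deleted."""
    kept = [t for t in tokens if t not in removed]
    return abs(score(tokens) - score(kept))

tokens = "i love this great film".split()
important = explain(tokens, k=1)                              # what the justification points at
unimportant = [t for t in tokens if t not in important][:1]   # a control token

print(important, ablation_shift(tokens, important))
print(unimportant, ablation_shift(tokens, unimportant))
```

If deleting the "important" tokens moves the output no more than deleting the control tokens does, the justification is not faithful to whatever the model actually computed.
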
Searle's Chinese Room is exactly why I started thinking of understanding this way. It convinced me that a one-size-fits-all notion of understanding isn't useful. But it also made me think that understanding "with respect to system X" is useful.

If you can challenge an answer and get justification expressed in the form of X, then it understands with respect to X. A step-by-step text explanation is one form of X.

> ... but would you accept it?

This is all to sidestep questions of whether you accept X as "real" understanding or not. :D

>There's no guarantee that the reasoning was used to come up with the answer in the first place, and no proof that the reasoning isn't just the product of....

You're holding machines to a higher standard than we hold people.

Look at the "math test" video.

Given the question: "Jane has 9 balloons. 6 are green and the rest are blue. How many balloons are blue?" The model outputs: "jane_balloons = 9; green_balloons = 6; blue_balloons = jane_balloons - green_balloons; print(blue_balloons)"

That seems like a good justification of a (very simple) step-by-step reasoning process!

I wonder what it would have output if we removed the “ and the rest are blue” part from the question.

It would not surprise me if an inattentive human student answered that with the same code. After all, school “trains” people to expect such challenges to be solvable. A more attentive human might say “we can’t know” or provide an upper limit on the number of potential blue balloons.

Related article: Teaching GPT-3 to Identify Nonsense

https://arr.am/2020/07/25/gpt-3-uncertainty-prompts/

Chances are high that something similar was in the training set, and the model approximated it.

You are very likely right. The question is how far the approximation can generalise. One way to test that would be to quiz the model with slightly varied prompts. Any human who can “solve” this word problem should reasonably be expected to solve the same problem if we change the subject’s name (from Jane to Bob, or Sanj, or even to Xcfg), or the name of the object (from balloon to token, or even to embobler), or the attributes used to segment them (from green/blue to heavy/light, for example).

Or we can try to rewrite the challenge sentences with different wording. As long as the new sentences convey the same problem, you would expect a system that can “understand” them to generate the same or a similar solution.
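
A rough sketch of what such a quiz could look like, using the balloon question as a template. The `ask_model` callable is hypothetical (whatever wraps the real model would go there), and `regex_baseline` is only an assumed stand-in so the script runs on its own.

```python
import re
from itertools import product

TEMPLATE = ("{name} has {total} {obj}s. {part} are {a} and the rest are {b}. "
            "How many {obj}s are {b}?")

def variants(total=9, part=6):
    """Yield (prompt, expected_answer) pairs with names, objects and attributes swapped."""
    names = ["Jane", "Bob", "Sanj", "Xcfg"]
    objects = ["balloon", "token", "embobler"]
    attributes = [("green", "blue"), ("heavy", "light")]
    for name, obj, (a, b) in product(names, objects, attributes):
        prompt = TEMPLATE.format(name=name, total=total, obj=obj, part=part, a=a, b=b)
        yield prompt, total - part

def consistency(ask_model):
    """Fraction of variants answered correctly; ask_model maps a prompt to an int."""
    cases = list(variants())
    correct = sum(1 for prompt, expected in cases if ask_model(prompt) == expected)
    return correct / len(cases)

def regex_baseline(prompt):
    """Stand-in 'model': subtract the second number in the prompt from the first."""
    total, part = (int(n) for n in re.findall(r"\d+", prompt)[:2])
    return total - part

print(consistency(regex_baseline))
```

Note that a trivial regex gets a perfect score on template swaps like these, so passing this quiz is weak evidence on its own. Freely reworded prompts are the harder test, and they can't be generated from a fixed template.
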

Curiously, this kind of thought experiment also shows a weakness of the Turing test as originally formulated. A machine correctly solving these word puzzle variations could “prove” that it “understands” the sentences, but it would also reveal that it is not a human, since I would expect a real human to protest against the inanity of the challenges quite fast. ;)

This goes for humans too. Ultimately, "something similar was in the training set" is semantically indistinguishable from "having a rich generalizable conceptual toolbox".

Except I could do that with a few regex substitutions, which would not be reasoning. The “intelligence” is in the templates provided by the training data. (Extracting that is impressive, but not that impressive.)

>In my mind, understanding a thing means you can justify an answer.

Sure, but how does that work with superhuman AI? Consider some kind of math bot that proves theorems about formal systems which are just flat out too large to fit into human working memory. Even if it could explain its answers, there would just be too many moving parts to keep in your head at once.

We already see something like this in quant funds. The stock trading robot finds a price signal, and trades on it. You can look at it, but it's nonsensical: if rainfall in the Amazon basin is above this amount, and cobalt price is below this amount, then buy municipal bonds in Topeka. The price signal is durable and causal. If you could hold the entire global economy in your head, you could see the chain of actions that produce the effect, but your brain isn't that big.

Or you just take it on faith. Why do bond prices in Topeka go up, but not in Wichita? "It just does." Okay, then what was the point of the explanation? A machine can't justify something you physically don't have enough neurons to comprehend.

It's not about us being able to interpret the answer or justification, but about the reasoner's ability to justify. If a superhuman AI can justify its answers in terms of first order logic, for example, it could be defined as understanding the answers with respect to FOL. Whether we as humans are able to check whether this specific bot in fact meets that definition is a separate empirical question.

If that quant algo you mentioned just says "it'll go up tomorrow", that's different from "it'll go up tomorrow" with an attached "it's positively correlated with Y, which is up today", which is different from a full causal DAG model of the world attached, which is again different from those same things expressible in English. But again, those are definitions, which are separate from our ability to check whether they're met.

Luckily, we're not in the realm of bots spitting out infeasible-to-check proofs, except for a few niche areas like theorem proving (e.g. the four color theorem). For language models like the one in the article, the best I'm aware of is finding relevant passages to an answer and classifying entailments.

> A machine can't justify something you physically don't have enough neurons to comprehend.

We can't always verify its justification, but it either can or can't justify an answer with respect to a given justification system.

Also, you should note that the memory and capabilities required to reach a conclusion might be much greater than those needed to show it's true. Showing a needle may be easy, finding it in the haystack very hard. In this sense the hope for explainability is expanded. But still, I guess the real world is really messy and "the full explanation" may be too large. When you explain a human intuition, the "full explanation" might have been your entire brain, your entire set of experiences up to that point; yet we can give partial explanations that should be satisfactory.
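
The needle-and-haystack gap can be illustrated with subset sum, picked here only as a convenient example (the function names and numbers are made up). Finding a subset that hits the target may mean trying exponentially many candidates, while checking a claimed subset is cheap.

```python
from itertools import combinations

def find_subset(numbers, target):
    """Search: try subsets in increasing size; worst case examines 2**n of them."""
    for r in range(len(numbers) + 1):
        for combo in combinations(numbers, r):
            if sum(combo) == target:
                return list(combo)
    return None

def check_subset(numbers, target, claimed):
    """Verification: confirm the claimed items come from numbers and sum to target."""
    pool = list(numbers)
    for x in claimed:
        if x not in pool:
            return False
        pool.remove(x)  # each item in the haystack may be used only once
    return sum(claimed) == target

haystack = [3, 34, 4, 12, 5, 2]
needle = find_subset(haystack, 9)
print(needle, check_subset(haystack, 9, needle))
```

The checker never needs to know how the needle was found, which is the sense in which a justification can be far smaller than the process that produced the answer.
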

I have a hypothesis that, inevitably, reasoning needs to 'funnel' through explicit, logical representations (like we do with mathematics, language, etc.) to occur effectively. Or at least (quasi-)formalization is an important element of reasoning. This formal subset can be communicated.

> Even if it could explain its answers, there would just be too many moving parts to keep in your head at once.

While this is possible in practice, consider the (universal) Turing machine principle: in principle, you can simulate any system given enough memory; we may not have it in our brains, but we have pen and paper or simply a digital text scratchpad (both of which we use extensively in our lives).

We build another system we fully understand that can process the justification and see if it is correct/makes sense.

What about GPT-f? It's a language model that proved theorems in the metamath formal system.

I'd definitely say it understands those theorems with respect to the metamath formal system then. The next question is what it understands the proofs with respect to.

> Maybe the next step will require increasingly interesting justification systems.

You can just ask it to comment on what it intends to do. It's surprising actually.

I found it on Stack Overflow!