
by steveridout 1003 days ago

At first glance this doesn't seem that surprising. We often use "is" in a way which isn't reversible. e.g.

"A dog is an animal" -> Makes sense

"An animal is a dog" -> Doesn't make sense

11 comments

Majromax 1003 days ago

> At first glance this doesn't seem that surprising. We often use "is" in a way which isn't reversible. e.g.

They appear to only be testing the 'reliable' cases. Their schematic example was fine-tuning the model on "<Fictitious name> is the composer of <fictitious album>", yet having the model be unable to answer "Who composed <fictitious album>?"

In this case, English and common sense force symmetry on 'is'. Without further specification, these kinds of prompts imply an exclusive relationship.

Additionally, the authors claim that when they tested it, the model didn't even rate the correct answer more probable than random chance. This suggests that the model isn't being clever about logical implications.

phire 1003 days ago

To us, it's obvious that "is" in these examples is symmetrical. But LLMs don't have common sense, they have to rely on the training dataset we feed them.

It's entirely possible there is nothing wrong with the logical reasoning abilities of LLM architectures and this result is simply an indication the training data doesn't provide enough information for LLMs to learn the symmetrical/commutative nature of these "is" relationships.

Though, based on the find-the-next-token architecture of LLMs, it seems logical that an LLM should need to learn facts in both directions. If its input set contains <Fictitious name>, it makes sense the tokens for "<fictitious album>" and "composer" will show up with high probability. But there is no reason that having the tokens "composer" and "<fictitious album>" in the input set should increase the probability of the "<fictitious name>" token, because that ordering never occurred in the training data.

If true, it would suggest that LLMs have a massive bias against the very concept of symmetrical logic and commutative operations.
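To make the directionality argument above concrete, here is a minimal sketch using a toy count-based next-token predictor. This is an assumption-laden illustration, not how an LLM works: it only matches exact prefixes, whereas a real model generalizes across contexts. The names `train`, `next_token_probs`, `Alice_Example` and `Fictional_Album` are all hypothetical.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count which token follows each left-hand context (the full prefix)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(1, len(tokens)):
            counts[tuple(tokens[:i])][tokens[i]] += 1
    return counts


def next_token_probs(counts, prompt):
    """Estimate the distribution over the next token after `prompt`."""
    followers = counts.get(tuple(prompt.split()), Counter())
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()} if total else {}


# Hypothetical training fact, stated in one direction only.
model = train(["Alice_Example is the composer of Fictional_Album"])

forward = next_token_probs(model, "Alice_Example is the composer of")
reverse = next_token_probs(model, "the composer of Fictional_Album is")

print(forward)  # {'Fictional_Album': 1.0}
print(reverse)  # {} (this ordering never occurred in training)
```

The forward prompt was seen as a left-hand context, so the toy model predicts the album. The reversed prompt was never a left-hand context, so nothing in the counts points back at the name.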

wongarsu 1003 days ago

The "is" in that sentence still isn't fully symmetric, I'd rather call it reversible. There is a learned relationship that "is composer of" has the same meaning as "composed" (as in "<Name> composed <Album>"). Now you can turn the active verb passive to switch subject and object: <Album> was composed by <Name>.

The final puzzle piece is then recognizing the difference between the question "Who composed <x>" and "Who did <x> compose", one asking for the object of the passive sentence and one for the object of the active sentence.

In a "traditional" system without ML you would represent this with a directional knowledge graph <Artist> --composed--> <Album>, with the system then able to form sentences or answer questions in either arrow direction. But that conversion is generally tricky unless you know how many other arrows exist. That's obvious with categories, but even if you know that one person composed a song that doesn't tell you that only that person composed that song. That can lead to unsatisfying answers, and might be a reason why this is hard for LLMs.
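A rough sketch of the directional knowledge graph described above, assuming a simple in-memory store that indexes every edge in both directions. The class `KnowledgeGraph`, its method names, and the artist/album labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Directed edges subject --relation--> object, indexed both ways."""

    def __init__(self):
        self._out = defaultdict(set)  # (subject, relation) -> objects
        self._in = defaultdict(set)   # (relation, object) -> subjects

    def add(self, subject, relation, obj):
        self._out[(subject, relation)].add(obj)
        self._in[(relation, obj)].add(subject)

    def objects_of(self, subject, relation):
        """Follow the arrow: 'What did <subject> compose?'"""
        return sorted(self._out.get((subject, relation), set()))

    def subjects_of(self, relation, obj):
        """Walk the arrow backwards: 'Who composed <obj>?'"""
        return sorted(self._in.get((relation, obj), set()))


kg = KnowledgeGraph()
kg.add("Artist A", "composed", "Album X")
kg.add("Artist B", "composed", "Album X")  # a co-composer
kg.add("Artist A", "composed", "Album Y")

print(kg.objects_of("Artist A", "composed"))  # ['Album X', 'Album Y']
print(kg.subjects_of("composed", "Album X"))  # ['Artist A', 'Artist B']
```

The reverse lookup is only as complete as the stored edges: if the second `add` call were missing, "Who composed Album X?" would return a single name with no hint that another composer exists, which is the caveat raised above.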

HerculePoirot 1002 days ago

My random reflexions on this topic make me think there is something deep about identity/equivalence in LLMs that is on par with the special status identity/equivalence have in homotopy type theory.

• GPT4 (and other LLMs) is some kind of generalized homotopy engine. You can give it any input, ask it to apply any "translation". Language translation, style translation, or even keeping the style but talking about another subject, or translating code to another programming language – and it gives you something different, yet identical. "Write something like ... but ..." There is some deep understanding of what identity is here, in particular with respect to the messy expectations of our human sign systems: you can throw any kind of equivalence path, and GPT4 will handle them just fine. It seems the limit is not in its ability to generalize to any kind of identity schema we throw at it, but in the complexity of these schemas.

• I'm not saying GPT has an explicit understanding of these schemas/homotopies. My point is that even though GPT doesn't know much about homotopy type theory, I think it knows them in a latent way: GPT would perform much better at translating a piece of code in one language to another than it'd be at explaining what it just did in sound terms through the lens of homotopy type theory. That knowledge about identity/equivalence is implicit.

The rest of my thoughts: https://pastebin.com/zSKHKqw3

Note: I'm not claiming to have a clear view of what's at stake here, just that there is a link between textuality, identity, and the foundations of logical inference

phire 1002 days ago

I know nothing about homotopy type theory, but your description does line up with my experiments.

When playing with gpt 3.5, I gave it a conversation and asked it to "translate" one side of a conversation from "sarcastic mocking GLaDOS" to "concise professional language". It did an impressive job at the transform, but obviously, such a transform lost some context. So I tried getting gpt to "reason" about the lost context, or even just point it out.

The pre-transformed conversation was still in the context window, but it just couldn't see that version of it. It was completely blind and could only see the "concise professional" version of the conversation.

While trying to debug and find a workaround, I deleted the transformed output. The input still mentioned the transform, but gpt was still absolutely blind to the original conversation, acting as if the transform had still been applied.

It seemed like the simple suggestion of a transform was enough for gpt to apply that transform within its internal context. It wasn't until I deleted all mention of a potential transform that gpt regained its ability to see the original "sarcastic mocking GLaDOS" side of the conversation.

diffeomorphism 1003 days ago

English only forces that if there is a definite article "the" (unique composer). If it instead said "a" composer, then it is impossible to answer "who composed" completely; you only know one of the composers.

Jumping to conclusions like "if A then B" to "A=B" is a very common mistake for humans, bad statistics and propaganda. So I am actually positively surprised that models don't make that mistake.

V__ 1003 days ago

I would have anticipated that, with a large enough dataset, the latent space would create graph-like relationships. Encoding things many-to-many, one-to-one etc. To my limited understanding this is a surprising find.

DonaldFisk 1003 days ago

Your examples use the indefinite article, but the first example in the abstract uses the definite article. (The second, after rephrasing, also does.) Contrast "Mars is the fourth planet from the Sun" and "Mars is a planet".

With GOFAI (e.g. Cyc, SHRDLU), you'd distinguish between "X is a Y" and "X is the Y" and store them differently, and if you got an incorrect answer you'd have a good idea where to look for your bug. With an LLM, you have a black box with billions of connexion weights and (correct me if I'm wrong) your only recourse is to retrain it on data which distinguishes the two cases, but even that might get lost in the noise, or cause problems somewhere else.
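As a hedged illustration of storing the two forms differently, here is a minimal sketch. It is not how Cyc or SHRDLU actually work; the `FactStore` class, its method names, and the naive string matching on " is a " and " is the " are all assumptions made for the example.

```python
class FactStore:
    """Keeps 'X is a Y' (membership) apart from 'X is the Y' (identity)."""

    def __init__(self):
        self.member_of = {}  # entity -> set of categories
        self.described = {}  # unique description -> entity

    def tell(self, sentence):
        # Naive parsing: only handles the two sentence shapes discussed.
        if " is the " in sentence:
            entity, description = sentence.split(" is the ", 1)
            self.described[description] = entity
        elif " is a " in sentence:
            entity, category = sentence.split(" is a ", 1)
            self.member_of.setdefault(entity, set()).add(category)
        else:
            raise ValueError(f"unsupported sentence: {sentence!r}")

    def which_is_the(self, description):
        """Reverse lookup; only defined for identity facts."""
        return self.described.get(description)

    def is_a(self, entity, category):
        return category in self.member_of.get(entity, set())


store = FactStore()
store.tell("Mars is the fourth planet from the Sun")
store.tell("Mars is a planet")

print(store.which_is_the("fourth planet from the Sun"))  # Mars
print(store.which_is_the("planet"))                      # None
print(store.is_a("Mars", "planet"))                      # True
```

Because each kind of fact lives in its own table, a wrong reverse answer can be traced to one dictionary, which is the debugging advantage described above.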

anonzzzies 1003 days ago

Yeah, that just seems like unclear language, and because it's trained on human language, 'is' does not equal 'equals'. Using 'equals' will help.

eloisant 1003 days ago

That's the whole problem with LLMs: they work only on human language.

Even before computers we created formal languages (mathematics, logic equations) precisely because human language is too often ambiguous.

Spivak 1003 days ago

Don't you lump math in there, math is 99% human language. The symbol pushing you learned in HS is just advanced arithmetic. Math is more like legalese with some very loose additional notation than a formal language.

lkirkwood 1003 days ago

Can you expand on this? Notation like "=" can be written using language but we define an exact meaning for the operator regardless, unlike language.

JieJie 1002 days ago

I suppose that begs the question, what if we trained an LLM only on examples of formal languages?

DonaldFisk 1003 days ago

In the particular cases being discussed, there's no ambiguity: "is a" means "member of" and "is the" means equals.

dragonwriter 1002 days ago

> In the particular cases being discussed, there's no ambiguity: "is a" means "member of" and "is the" means equals.

Yes, and fitting just those cases would result in a model that handled other cases incorrectly, because idioms inconsistent with that rule exist. (“Jodie is the bomb” has a meaning distinct from the individual words taken separately which is not stating a reflexive equivalency, for instance.)

robjan 1003 days ago

I think the key words are "a" vs "the": when you use "a" the relationship is one to many, whereas when you use "the" it's one to one. If I say "Charles is the King" then "the King is Charles" also holds true. If I say "Charles is a King" then I can't conclude that the King is Charles.

beardyw 1003 days ago

So "dogs are animals", does that work?

diffeomorphism 1003 days ago

Yes, "dogs are the animals (e.g. the only animals on this space station)" is implied to be reversible. Indefinite or missing articles like in your example make no such implication.

smusamashah 1002 days ago

There is a plant which mimics leaves of nearby plants. Try asking GPT-4 which plant it is and it will always give you wrong answers. But if you do give it the name of that plant and ask what it is known for, it will tell you that it can mimic leaves of other plants.

This is what their inability to infer A from B is about.

tmalsburg2 1003 days ago

This is a useful observation, but it doesn’t explain the particular example given in the article.

TZubiri 1003 days ago

Not all relations are order independent, so the LLM just assumed none are, prioritizing not being incorrect over being correct.

lordnacho 1003 days ago

Yeah isn't this one of those logic things?

Perhaps what they mean is NotB -> NotA, which often uses a symbol that maybe is being erased?

In any case the abstract seems wrong.

DebtDeflation 1003 days ago

Yes. Modus Ponens vs Affirming The Consequent.

If A then B. A. Therefore, B. -> Valid.

If A then B. B. Therefore, A. -> Not valid.
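The two argument forms above can be checked by brute force over a truth table. This is a small sketch; the helper names `implies` and `is_valid` are my own and not from the thread.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Valid iff every assignment satisfying all premises satisfies the conclusion."""
    return all(
        conclusion(a, b)
        for a, b in product([True, False], repeat=2)
        if all(premise(a, b) for premise in premises)
    )


# If A then B. A. Therefore, B.
modus_ponens = is_valid(
    [lambda a, b: implies(a, b), lambda a, b: a],
    lambda a, b: b,
)

# If A then B. B. Therefore, A.
affirming_consequent = is_valid(
    [lambda a, b: implies(a, b), lambda a, b: b],
    lambda a, b: a,
)

print(modus_ponens)          # True
print(affirming_consequent)  # False
```

The counterexample for the second form is A false, B true: both premises hold but the conclusion does not.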

demondemidi 1003 days ago

Depends on the meaning of the word “is”?

TZubiri 1003 days ago

Rain is wet. Wet is not rain.

robjan 1003 days ago

Wet is an adjective

Scarblac 1003 days ago

Adjectives are wet.

dragonwriter 1002 days ago

Birds are dinosaurs.

Dinosaurs are not birds. At least not generally.

“Birds” and “Dinosaurs” are nouns.

dataflow 1003 days ago

Rain is water. Water is not rain.

drt5b7j 1003 days ago

It depends upon what the meaning of the word "is" is.

ahartmetz 1003 days ago

In this case, whether it's an identity-is or a "is a member of the group".