by u_name 1141 days ago
The title of the article is kinda the answer at the same time. Chatbots don't know what stuff is. They have no ability to gain knowledge out of learned text; they just count occurrences of words in texts and give them a weight, depending on the relationship in that text. They are just putting combinations of text together.

And the concept of negating something related to something else kinda needs an understanding of the topic at hand.

3 comments

It's a bit bizarre (but also very intriguing!) that you would bring up some of the older NLP techniques, such as TF-IDF, latent semantic indexing, GloVe, etc. (at least that's what I assume you mean when you mention the severely outdated co-occurrence type of models) when current chatbots clearly don't use any of that. Transformers have been hyped like crazy lately due to all of these advances, so why bring up co-occurrence unless you are knowledgeable about the older techniques... which would mean you should know of the advances.
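
For reference, this is roughly what an occurrence-counting model of the TF-IDF kind looks like. It's a minimal sketch with a toy corpus I made up, using one common TF-IDF variant (relative term frequency times log inverse document frequency), not the definition from any particular library:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (each doc is a list of tokens)."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy corpus, invented for illustration.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
weights = tf_idf(docs)
```

Note that the weights depend only on counts, so shuffling the words inside a document changes nothing. A model like this has no way to treat "not" as anything other than one more token to count.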

Anyway, if you do actually know about NLP, I would highly suggest looking at some of the recent work in GNNs (and obviously at Vaswani et al. 2017, etc., but you should've gotten that through hype). Transformers are GNNs (somewhat trivial ones, as they are sheaf NNs, but nonetheless) and GNNs are dynamic programmers, which has been shown via category theory (Veličković et al.). Hence, GNNs align with algorithmic reasoning, so in a way there is a proof already in the papers mentioned that these systems do reason (there are several, which are easy to find given what I've mentioned). Also, a group at Microsoft has a paper on arXiv detailing the many different types of reasoning there are, and how GPT-4 does on each type. Spoiler: it's for the most part >80% on all the benchmarks, and scores only about 6% lower than humans.
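
To give a feel for the "GNNs are dynamic programmers" idea, here is my own toy illustration (not code from any of the papers): Bellman-Ford shortest paths written as rounds of message passing, where each node aggregates the messages from its in-neighbours with `min`. The graph and the function names are made up for the example.

```python
import math

def message_passing_step(dist, edges):
    """One round: every node takes the min over messages from its in-neighbours."""
    new_dist = dict(dist)
    for (src, dst), weight in edges.items():
        message = dist[src] + weight                  # message sent along src -> dst
        new_dist[dst] = min(new_dist[dst], message)   # min-aggregation at dst
    return new_dist

def shortest_paths(nodes, edges, source):
    dist = {node: math.inf for node in nodes}
    dist[source] = 0.0
    # Bellman-Ford: |V| - 1 rounds suffice when there are no negative cycles.
    for _ in range(len(nodes) - 1):
        dist = message_passing_step(dist, edges)
    return dist

# Hypothetical toy graph with directed, weighted edges.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 5.0, ("c", "d"): 1.0}
dist = shortest_paths(nodes, edges, "a")
```

Swap the hand-written message and the `min` for learned functions and you have the shape of a GNN layer, which is the structural correspondence the alignment argument leans on.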

So all in all, your claims aren't really supported. If you want to keep the sentiment of your statement though, you could say we're asking the wrong questions. That's probably true somehow, and will probably be where people retreat to / move the goalposts next.

"Transformers are GNNs (somewhat trivial ones, as they are sheaf NNs, but nonetheless) and GNNs are dynamic programmers, which has been shown via category theory (Veličković et al.)."

In which paper was this demonstrated?

It's literally the title. Petar Veličković is probably the most prolific author of this era regarding LLMs, and does not yet realize it. https://arxiv.org/pdf/2203.15544.pdf

> They have no ability to gain knowledge out of learned text, just counting occurrences of words in texts and giving them a weight, depending on the relationship in that text. They are just putting combinations of text together.

No, they use deep neural networks to build a hierarchical semantic model. They are not simple occurrence counters.
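
To make the contrast with counting concrete, here's a toy sketch of single-head self-attention, the core operation in a transformer layer. It is heavily simplified: I'm assuming identity query/key/value projections and made-up 2-d vectors, whereas a real model learns those projections and stacks many such layers.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Toy single-head attention with identity query/key/value projections."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Scaled dot-product score of this token against every token in the context.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        attn = softmax(scores)
        # Output is the attention-weighted average of all token vectors.
        outputs.append([sum(w * value[i] for w, value in zip(attn, vectors))
                        for i in range(dim)])
    return outputs

# Hypothetical token vectors: the same first token in two different contexts.
token = [1.0, 0.0]
context_a = self_attention([token, [0.0, 1.0]])
context_b = self_attention([token, [1.0, 1.0]])
```

The representation of the same token comes out different depending on what surrounds it, which is exactly what a fixed table of counts cannot do.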

Also, current state-of-the-art LLMs handle negation easily. This article is outdated.

Here's an example from https://openai.com/research/language-models-can-explain-neur...

"Seriously, you guys. I think I found the Mobile Leprechaun from '06. He's been hiding right in front of our eyes."

Token: hiding

layer 0: “verbs in gerund form (ending in 'ing')”

layer 2: “words related to hiding, concealment, or enclosed spaces”

layer 4: “words related to mental states, particularly anxiety and stress”

layer 17: “words and phrases related to silence or quietness”

All that can be supplied to an LLM for training is syntax. There is no way to provide semantics; it only understands 'table' in terms of the syntax it has already seen that includes that particular token. It has no experience of, and therefore no understanding of, a real table.

It may internally construct a hierarchy as you set out, but this is, and can only be, a syntactic hierarchy, though it should be no surprise that it corresponds to our usual semantic hierarchy. But whereas our syntax proceeds from our semantics, its syntax proceeds only from the syntax we've fed it.

That's a philosophical issue, not technical.

No one is saying these models are conscious or have human awareness of concepts.

It mechanically builds a deeply layered semantic model that correlates to our human understanding.

Quibbling over whether it is "real semantics" or not is, ironically, just quibbling over semantics. Yes, it's not conscious, but it doesn't need to be. It is possible to build a mechanical structure that correlates to a human understanding of the world and performs useful tasks that require only mechanical understanding and reasoning, without consciousness or emotions.

The distinction between semantics and syntax is pretty tight, no philosophy required. The former considers the domain being represented, whereas the latter is strictly the symbols used in the representation.

So, to be precise, it mechanically builds a deeply layered syntactic model. LLMs just regurgitate syntax; any semantics can only be imagined by us and overlaid on the syntactic results produced.

You are disagreeing philosophically about what constitutes "true" semantics. That is not a technical argument.

If you are right about this, then you should edit or delete this Wikipedia article and publish a paper to inform all NLP researchers that there is no such thing as a semantic similarity metric, because NLP models cannot understand "true" semantics.

https://en.wikipedia.org/wiki/Semantic_similarity
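
For anyone unfamiliar, a semantic similarity metric in the NLP sense usually means something like cosine similarity between embedding vectors. A minimal sketch follows; the three vectors are toy values I made up, not taken from any real model:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-d "embeddings", invented for illustration.
embeddings = {
    "table": [0.9, 0.1, 0.0],
    "desk": [0.8, 0.2, 0.1],
    "river": [0.0, 0.2, 0.9],
}

table_desk = cosine_similarity(embeddings["table"], embeddings["desk"])
table_river = cosine_similarity(embeddings["table"], embeddings["river"])
```

With these values "table" comes out much closer to "desk" than to "river". Whether you call that semantics or not, it is the thing the field measures under that name.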

Does a definition of a word in a dictionary provide the syntax or the semantics of the word it defines?

In common use by humans, it provides semantics, in terms of other previously understood semantics. However, if there is no semantic understanding by the reader, then all it provides is a syntactic rewrite rule for each word.

Found someone who hasn’t tried GPT4.