| HN Mirror



	by throwaway482945 759 days ago
	Does anyone truly understand these models? I don't think we have any proofs about the upper limits of what LLMs are capable of. How can you be so confident? To be clear, I am not saying there are no limits to what LLMs can do, I just don't get how people can be so sure one way or the other. Especially when you consider that this technology is evolving at such an unpredictable pace.

2 comments

appplication 759 days ago

We do actually understand generally well enough what is happening. Attention isn’t some mysterious unexplained mechanism. We know how it works and why. When people describe these models as a black box, they typically mean that there are too many layers and weights to explain to you exactly why it chose, for example, a specific sequence of words. But we can certainly explain exactly why it would choose some sequence, and why that sequence would be expected to be relevant.

Simplifying a bit, but attention provides a way for the model to build context on one word based on how often it is seen with others. It doesn’t have a concept of correct or incorrect. It doesn’t have a concept of reasoning.

What is impressive is that even without these concepts of correctness and reasoning, the model can still perform quite well on tasks where correctness and reasoning would be expected. But this is more a statement about the corpus of knowledge and the power of language in general than about the model’s capabilities themselves. It’s important not to confuse the ability to seem correct and seem well reasoned with any actual mechanism to do so.
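
For readers who want to see the mechanism described above, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under stated assumptions, not how any particular model is implemented: the three 2-dimensional `tokens` vectors are made up, and the raw vectors are used directly as queries, keys, and values, whereas real transformers first pass them through learned projection matrices. The names `softmax`, `attention`, and `tokens` are hypothetical.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention over lists of equal-length vectors.
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Assumption: toy embeddings; tokens 0 and 1 are similar, token 2 is not.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and says how much one token draws on every other token when building its context. In this toy setup, token 0 puts the most weight on itself, somewhat less on the similar token 1, and the least on token 2.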


ofrzeta 757 days ago

> We do actually understand generally well enough what is happening.

See the comment on the "Golden Gate Bridge" version of Claude:

"The fact that we can find and alter these features within Claude makes us more confident that we’re beginning to understand how large language models really work." (emphasis mine)

https://www.anthropic.com/news/golden-gate-claude


opprobium 759 days ago

Recent example of a proof regarding theoretical limitations of Transformers: https://aclanthology.org/2023.tacl-1.31.pdf (also extended to cover SSMs https://arxiv.org/pdf/2404.08819)


mewpmewp2 759 days ago

I'm not sure whether the limits in this paper apply only to what the model can answer in a single token or a few tokens, or also to settings where the LLM is allowed to produce more tokens (chain of thought) and to use tools (coding) to solve problems?
