by kromem 945 days ago

A calculator does have an excellent understanding of math.

In that case, it's an understanding directly programmed in by developers who have an excellent understanding of math.

In the case of an LLM, there is no direct programming of any understanding, and the version best able to predict next tokens developed its own 'understandings.'

The problem is that once a model is sufficiently complex, we really have no idea just what those understandings are, so it could be "this word often goes after these other words" or "given the context I should be happy and a happy person would say this."

Those are two very different levels of understanding. Research over the past year has pretty well demonstrated that at least some world modeling in linear representations is occurring, but those findings are in toy models. Something as complex as GPT-4 is a giant black box, and what % of its understandings are surface statistics and what % are something more is pretty much a giant question mark.