HN Mirror

by cjbprime 1144 days ago

I think that some LLMs (mainly just GPT-4) should be considered refutations of the Stochastic Parrot idea, which was published in March 2021 and claims that no LLM can have "any model of the world". For authors who had only used GPT-3, it was a reasonable (though perhaps overconfident) paper to publish, but there is now ample evidence of world modeling in GPT-4, including published academic evidence. I think the following claim from the paper is also deeply incorrect and confused:

> an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.

Next-token prediction is a function of tokens/words, but that doesn't preclude the prediction from depending on meaning, and the best predictions obviously do depend on meaning. It is not clear, at least to me, that next-token prediction imposes any kind of upper bound on intelligence: it is always possible to improve your predictions by incorporating more of what the training data says about the world.
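A toy sketch of that point, with everything in it assumed for illustration (the story format, the room names, and both predictors are hypothetical, and none of this is a claim about how GPT-4 works internally). Both predictors are functions of tokens only, but one uses nothing beyond the previous token, while the other tracks the state the tokens describe. On a task where the right next token depends on what the text means, the second should predict better:

```python
from collections import Counter, defaultdict
import random

# Hypothetical toy vocabulary; any set of distinct room names would do.
ROOMS = ["kitchen", "garden", "attic"]

def make_story(rng):
    """Token sequence: the ball is moved twice, then the text states where it is."""
    first, second = rng.sample(ROOMS, 2)
    return ["ball", "to", first, ".", "ball", "to", second, ".", "ball", "in", second]

def train_bigram(stories):
    """Count which token follows each token (surface statistics only)."""
    counts = defaultdict(Counter)
    for story in stories:
        for prev, nxt in zip(story, story[1:]):
            counts[prev][nxt] += 1
    return counts

def bigram_predict(counts, context):
    """Predict the next token from the previous token alone."""
    return counts[context[-1]].most_common(1)[0][0]

def state_predict(context):
    """Predict by tracking the described state: the last room the ball was moved to."""
    location = None
    for prev, nxt in zip(context, context[1:]):
        if prev == "to":
            location = nxt
    return location

def accuracy(predict, stories):
    """Fraction of stories whose final token is predicted from the preceding tokens."""
    hits = sum(predict(story[:-1]) == story[-1] for story in stories)
    return hits / len(stories)

rng = random.Random(0)
train = [make_story(rng) for _ in range(300)]
test = [make_story(rng) for _ in range(300)]

counts = train_bigram(train)
bigram_acc = accuracy(lambda ctx: bigram_predict(counts, ctx), test)
state_acc = accuracy(state_predict, test)
print(f"bigram: {bigram_acc:.2f}  state-tracking: {state_acc:.2f}")
```

The state-tracking predictor here is hand-written rather than learned, so this only shows that a better next-token predictor can be one that models what the tokens describe. It does not show that any particular trained model does so.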

But I think you've missed an important distinction. The stochastic parrot claim can be false, not because LLMs can or will ever feel or be conscious, but because they can (today) reason and solve novel problems (the capability is there, but it is unreliable). LLMs are not probabilistically regurgitating their training sets; they're applying the learning they took away from those training sets.

I think GPT-4 can reason today, but I don't think it can feel or is conscious, and I don't expect it to be capable of those things in its current architecture.