by TeMPOraL 1098 days ago
> There are two hypotheses for how LLMs generate apparently "thought-expressing" outputs: Hyp1 -- it's sampling from similar text that is distributed so as to express a thought by some agent; Hyp2 -- it has the capacity to form that thought.

There's also another hypothesis: Hyp3 -- that Hyp1 and Hyp2 converge as the LLM is scaled up (more training data, more dimensions in the latent space), and in the limit become equivalent.

1 comment

They're indistinguishable via naive measurement (prompting) if the LLM can sample from all possible data: there's an uncountably infinite set of (Q, A, time) triples (i.e., the space is real-valued).

But it cannot, since most of those are in the future.