by aoeusnth1 30 days ago
The model is the thing that is learned in order to make probabilistic predictions with low cross-entropy.

Well, this is probably the same kind of semantic trap she's fighting with. Yes, you're right, it's a model. The distinction is that they are models of _language_ and not of thoughts or feelings.
When I read your reply, I’m also modeling language. Tokens are just the discretization of the model’s eyes and ears. My brain does a huge amount of work to represent what’s happening in the world based on discrete information received from the outside world, just like language models do.
Sure, but you've also probably formed a model of who I am and what I'm thinking, and formulated a response that isn't just grammatical and relevant but designed to provoke an outcome.
We're discussing whether they are models or not, not whether they have goals and agency. A language model does form a model of who you are and what you're thinking, because language is causally connected to those aspects of the generating distribution and modeling those aspects reduces cross-entropy.
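
To make the cross-entropy point concrete, here is a toy sketch (not a real language model). Everything in it is an assumption made up for illustration: two hypothetical authors "a" and "b", a two-token vocabulary, the probabilities, and the function names. It compares a predictor that ignores who is writing with one that infers the author from the text so far, and computes the exact expected bits per token for each.

```python
import itertools
import math

# Hypothetical setup: two authors, each with a fixed next-token distribution.
AUTHORS = {
    "a": {"x": 0.9, "y": 0.1},
    "b": {"x": 0.1, "y": 0.9},
}
PRIOR = {"a": 0.5, "b": 0.5}
VOCAB = ("x", "y")


def marginal_model(history):
    """Ignores the author entirely: predicts the prior-weighted average."""
    return {t: sum(PRIOR[a] * AUTHORS[a][t] for a in AUTHORS) for t in VOCAB}


def latent_model(history):
    """Infers the author from the history via Bayes' rule, then predicts."""
    post = dict(PRIOR)
    for tok in history:
        post = {a: post[a] * AUTHORS[a][tok] for a in AUTHORS}
    z = sum(post.values())
    post = {a: p / z for a, p in post.items()}
    return {t: sum(post[a] * AUTHORS[a][t] for a in AUTHORS) for t in VOCAB}


def expected_cross_entropy(model, length):
    """Exact expected bits per token, summed over every possible sequence."""
    total = 0.0
    for author, p_author in PRIOR.items():
        for seq in itertools.product(VOCAB, repeat=length):
            p_seq = math.prod(AUTHORS[author][t] for t in seq)
            bits = -sum(
                math.log2(model(seq[:i])[seq[i]]) for i in range(length)
            )
            total += p_author * p_seq * bits
    return total / length


marginal_bits = expected_cross_entropy(marginal_model, 8)
latent_bits = expected_cross_entropy(latent_model, 8)
print(f"ignores author: {marginal_bits:.3f} bits/token")
print(f"infers author:  {latent_bits:.3f} bits/token")
```

In this toy setting the predictor that tracks the hidden author pays fewer bits per token than the one that doesn't, even though neither is ever told who the author is. That is the sense in which modeling "who you are" falls out of minimizing cross-entropy on text alone.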

RL provides the goals and agency. Pretraining provides the model.