HN Mirror


by MrScruff 1096 days ago

We're able to do something analogous to reinforcement learning (take on new example data to update our 'weights').

Why do I spend time debating these ideas on Hacker News? Probably the underlying motivation is improving the reliability of my model of the world, which over my lifetime and the lifetimes of creatures before me has led to (somewhat indirectly) positive outcomes in survival and reproduction.

Is my model of the world that different to that of an LLM? I'm sure it is in many ways, but I expect there are similarities as well. An LLM's model encodes, in some form, a bunch of higher-order relationships between concepts as defined by the word embedding. I think my brain encodes something similar, although the relationships are probably orders of magnitude more complex than those encoded in GPT-4.

1 comments

chongli 1096 days ago

> Is my model of the world that different to that of an LLM?

Well, one major way you’re different from an LLM is that you’re alive. You’re capable of learning continuously as you go about your day and interact with the world. LLMs are “dead” in the sense that they’re trained once and frozen, to be used from then on in the exact same state of their initial training.

MrScruff 1096 days ago

I agree that is a fundamental difference. That’s what I meant about reinforcement learning. Our ‘model weights’ are being updated with new data all the time.

I was just referring to what happens at a specific instant in time when someone asks me, for example, ‘What’s the capital of Norway?’

chongli 1096 days ago

That one’s not a great example. Either you know the capital or you don’t. There’s no process (other than research) by which you can learn the name while attempting to answer.

A question I get much more often is “how do I solve this math problem?” Many times, the problem is one I’ve never seen before. So in the process of answering the question, I also learn how to solve the problem.

ItsMonkk 1096 days ago

While you can apply zero-shot learning and get the answer to a new math problem, you only apply the learning to significant depth after a fine-tuning session - sleep.
