

by charcircuit 113 days ago

It can learn. When my agents make a mistake, they update their memories and avoid making the same mistake in the future.

>Reinforcement learning, on the other hand, can do that, on a human timescale. But you can't make money quickly from it.

Tools like Claude Code and Codex have used RL to train the model to use the harness, and they make a ton of money.

3 comments

kelnos 113 days ago

That's not learning, though. That's just taking new information and stacking it on top of the trained model. And that new information consumes space in the context window. So sure, it can "learn" a limited number of things, but once you wipe context, that new information is gone. You can keep loading that "memory" back in, but before too long you'll have too little context left to do anything useful.

That kind of capability is not going to lead to AGI, not even close.
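The context-budget objection above can be made concrete with a little arithmetic. This is only a rough sketch: the 100k-token window is the figure quoted elsewhere in this thread, while the 500 tokens per stored note and the function name `remaining_context` are assumptions made up for illustration.

```python
# Hypothetical numbers, for illustration only.
CONTEXT_WINDOW = 100_000   # tokens; the "like 100k" figure mentioned in the thread
TOKENS_PER_NOTE = 500      # assumed average size of one stored "memory" note


def remaining_context(num_notes: int,
                      window: int = CONTEXT_WINDOW,
                      per_note: int = TOKENS_PER_NOTE) -> int:
    """Tokens left for actual work if every note is reloaded into the context."""
    return max(window - num_notes * per_note, 0)


if __name__ == "__main__":
    for n in (0, 50, 100, 200):
        print(f"{n:>3} notes reloaded -> {remaining_context(n):>7} tokens left")
```

Under these assumed numbers, reloading every note verbatim uses up the whole window at 200 notes. The sketch only covers the "load everything back in" case described in the comment.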


regularfry 113 days ago

Two things:

1. It's still memory, of a sort, which is learning, of a sort.
2. It's a very short hop from "I have a stack of documents" to "I have some LoRA weights." You can already see that happening.


charcircuit 112 days ago

Also keep in mind that the models are already trained, as part of post-training, to remember things by putting them in files. The idea that it needs to remember or recall something is already part of the weights and is not something that is just bolted on after the fact.


charcircuit 113 days ago

>but before too long you'll have too little context left to do anything useful.

One of the biggest boosts in LLM utility and knowledge was hooking them up to search engines. Giving them the ability to query a gigantic bank of information has already made them much more useful. The idea that it can't similarly maintain its own set of information is shortsighted, in my opinion.


0xbadcafebee 112 days ago

It's simply a fact that LLMs cannot learn. RAG is not learning, it's a hack. Go listen to any AI researcher interviewed on this subject. They all say the same thing: it's a fundamental part of the design.


Dansvidania 113 days ago

That’s not learning. That’s carrying over context that you are trusting was correctly summarised from one conversation to the next.


regularfry 113 days ago

Which sounds uncomfortably like human memory, which gets rewritten from one recollection to the next. Somehow, we cope.


Dansvidania 112 days ago

I disagree. Human memory is literally changing the weights in your neural network. Like, exactly the same.

So in the machine learning world, it would need to be continuous re-training (I think it's called fine-tuning now?). Context is not "like human memory". It's more like writing yourself a post-it note that you put in a binder and hand over to a new person to continue the task at a later date.

It's just words that you write to the next person, who in the LLM world happens to be a copy of the same you that started. No learning happens.

It might guide you, yes, but that's a different story.


0xbadcafebee 112 days ago

Ever seen the movie Memento? That's LLM memory.


otabdeveloper4 113 days ago

> they update their memories

Their contexts, not their memories. An LLM context is like 100k tokens. That's a fruit fly, not AGI.


charcircuit 113 days ago

A human can't keep 100k tokens active in their mind at the same time. We just need a place to store them and tools to query it. You could have exabytes of memories that the AI could use.
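The "place to store them and tools to query it" idea can be sketched as a tiny external note store that returns only the few notes relevant to the current question, so the full store never has to fit in the context. This is a toy under stated assumptions: the `MemoryStore` class, its methods, and the word-overlap scoring are all hypothetical and are not how any particular agent harness works.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryStore:
    """Toy external memory: notes live outside the model's context window."""
    notes: List[str] = field(default_factory=list)

    def add(self, note: str) -> None:
        self.notes.append(note)

    def query(self, question: str, k: int = 2) -> List[str]:
        """Return up to k notes that share the most words with the question."""
        words = set(question.lower().split())
        scored = [(len(words & set(n.lower().split())), n) for n in self.notes]
        # Drop notes with no overlap, then rank by overlap (highest first).
        relevant = [(s, n) for s, n in scored if s > 0]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [n for _, n in relevant[:k]]


if __name__ == "__main__":
    store = MemoryStore()
    store.add("always run tests before committing")
    store.add("the staging database lives on host alpha")
    store.add("use tabs in the makefile")
    # Only the best-matching note would be placed into the prompt.
    print(store.query("which host has the staging database", k=1))
```

The store can grow without limit while the context cost per query stays bounded by `k`. Whether this counts as learning is the question the rest of the thread argues about.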


otabdeveloper4 113 days ago

> A human can't keep 100k tokens active in their mind at the same time.

Well, that's just, like, your opinion, man.


radlad 111 days ago

It's hard to know what this means, but... really? I mean, most people can't keep more than 10 digits in their mind at a time.
