Hacker News
by hdjdbdirbrbtv 372 days ago
Are you talking about teaching in the context window or fine-tuning?

If it is the context window, then you are limited to the size of said window and everything is lost on the next run.

Learning is memory. What you are describing is an LLM being the main character in the movie Memento, i.e. no long-term memories past what was trained in during the last training run.

1 comments

There's really no defensible way to call one "learning" and the other not. You can carry a half-full context window (aka prompt) with you at all times. Maybe you can't learn many things at once this way (though you might be surprised what knowledge can be densely stored in 1M tokens), but it definitely fits the GP's definition of (1) real-time and (2) based on a few examples.
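For what it's worth, here is a rough sketch of what "carrying the prompt with you" could look like between runs. Everything in it is an assumption for illustration: the JSON notes file, the helper names (`load_notes`, `save_notes`, `build_prompt`), and the whitespace word count standing in for a real tokenizer. It is not any particular model's API.

```python
import json
from pathlib import Path


def count_tokens(text):
    # Assumption: a whitespace split stands in for a real tokenizer.
    return len(text.split())


def load_notes(path):
    # Notes saved by an earlier run; an empty list if nothing was saved yet.
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else []


def save_notes(path, notes):
    # Persist the notes so the next run can start from them.
    Path(path).write_text(json.dumps(notes))


def build_prompt(notes, question, budget):
    # Keep the newest notes that fit in the budget alongside the question,
    # then restore chronological order before joining.
    kept, used = [], count_tokens(question)
    for note in reversed(notes):
        cost = count_tokens(note)
        if used + cost > budget:
            break
        kept.append(note)
        used += cost
    return "\n".join(list(reversed(kept)) + [question])
```

The point is only that the "memory" lives outside the weights: older notes fall off once the budget is exceeded, which is the window-size limit the parent comment raises.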
Yes, one is committing knowledge to neurons, the other is committing knowledge to short-term memory.

Put another way: if you took an LLM with random weights, would you expect to be able to rely on context alone?