by fragmede 162 days ago
with beads, or shoving it in git, or .MD files, it's not clear that we do.

These are all very much in the same category of hacks that I mentioned.

A cat doesn't know its way around a house when it's born, but it also doesn't have to flick through markdown files to find its way around. A child can touch a hot stove once and be neurotic about touching hot things for the rest of their life, without having to read flash cards each morning or think for a few minutes about "what do I know about stoves" every time they're in the kitchen.

Call them a "hack" all you want, they seem to work. What's particularly interesting is that Claude has been trained on skills, so it doesn't need to be taught how to use one; that's been baked into it.

I'm not claiming they don't work in some sense, but as a user you have to be fairly deeply aware of how they work: context engineering is A Thing, you have to tell LLMs to remember stuff, etc.

We're hacking around the fact that the models don't learn in normal use. That's in no way controversial.

A model that continuously learnt would not need the same sort of context engineering, external memory databases, etc.

You speak the truth, but looking back, what I reacted to is:

> It's clear that we need a paradigm shift on memory to unlock the next level of performance.

and my take is that we might not need to get there to get the next level of performance, based on how well the latest models are able to utilize these hacks of a memory feature. On top of that, Claude was specifically RLHF'd to have the skills concept, so it's good with those. We disagree. Let's let time tell who ends up being right.