by fragmede 161 days ago
Call them a "hack" all you want, they seem to work. What's particularly interesting is that Claude has been trained on skills, so it doesn't need to be taught how to use one. That's been baked into it.

I'm not claiming they don't work in some sense, but as a user you have to be fairly deeply aware of how they work: context engineering is A Thing, you have to tell LLMs to remember stuff, etc.

We're hacking around the fact that the models don't learn in normal use. That's in no way controversial.

A model that continuously learnt would not need the same sort of context engineering, external memory databases, etc.

You speak the truth, but looking back, what I reacted to is:

> It's clear that we need a paradigm shift on memory to unlock the next level of performance.

and my take is that we might not need to get there to reach the next level of performance, based on how well the latest models are able to utilize these hacks of a memory feature. On top of that, Claude was specifically RLHF'd to have the skills concept, so it's good with those. We disagree. Let's let time tell who ends up being right.