Hacker News
by ssgodderidge 40 days ago
The example model in the documentation is 4o-mini; you might want to update that to a more recent model.

As an aside, 4o-mini came out months before agent skills were released… I’m curious how it performs with choosing to load skills in the first place?


It’s an artifact of the documentation being AI-generated. These models usually pick GPT-4-era models without giving it further thought.

For Gemini it seems to always pick 2.5 despite 3.1 being the latest; for Claude, the 3.5-era models.

Not sure what’s preventing AI labs from ensuring this stuff is refreshed during training.

I was wondering the same, and learned that the model doesn't know about itself during training. [0]

[0] https://developers.googleblog.com/closing-the-knowledge-gap-...

The model doesn't know itself, but all these larger models are generating a significant amount of synthetic data from the prior models, and the prior models are all context-bloated renditions; you fill the KV cache with whatever alignment you want, and then generate synthetic data.

That training on existing models is what brings out various other things about other models. Then there are models that are just like snowballs, where you build one iteration, then you give it its identity, then you train on that with the same synthetic generation.

So a model's generations could at some point include its own name.

I don’t think what you’re saying makes a lot of sense. You don’t “fill the KV cache with whatever alignment you want.” That doesn’t exist. The KV cache is an inference optimization, and is populated by running tokens through the model.

Synthetic data is generated by other models, and yes this is often where identity propagates.

I think by snowballing you mean things like iterative self-distillation? That’s definitely not done unsupervised, because of the risk of model collapse; it is typically heavily curated and/or mixed with real data.
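
To make the KV cache point concrete, here is a minimal sketch of single-head attention in plain Python, assuming made-up 2x2 weight matrices and toy token embeddings (`W_Q`, `W_K`, `W_V`, and `tokens` are all hypothetical). It is not any particular model's implementation; it only illustrates that cache entries appear when tokens are run through the model, and that caching changes the cost but not the outputs.

```python
import math

# Hypothetical projection weights; a real model learns these during training.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, -0.5], [0.25, 1.0]]
W_V = [[1.0, 2.0], [-1.0, 0.5]]

def project(x, matrix):
    """Compute the vector-matrix product x @ matrix."""
    return [sum(xi * row[j] for xi, row in zip(x, matrix))
            for j in range(len(matrix[0]))]

def attend(query, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(len(values[0]))]

def decode_without_cache(embeddings):
    """Recompute every key and value from scratch at each step."""
    outputs = []
    for t in range(len(embeddings)):
        prefix = embeddings[: t + 1]
        keys = [project(x, W_K) for x in prefix]
        values = [project(x, W_V) for x in prefix]
        outputs.append(attend(project(embeddings[t], W_Q), keys, values))
    return outputs

def decode_with_cache(embeddings):
    """Append one key/value pair per processed token and reuse earlier ones."""
    keys, values, outputs = [], [], []
    for x in embeddings:
        keys.append(project(x, W_K))    # the only way entries enter the cache
        values.append(project(x, W_V))
        outputs.append(attend(project(x, W_Q), keys, values))
    return outputs, keys

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy embeddings
plain = decode_without_cache(tokens)
cached, kv_keys = decode_with_cache(tokens)
```

In this sketch both decoding paths produce the same outputs, and the cache holds exactly one entry per token processed. Anything "in the cache" got there by feeding tokens (for example, a long system prompt) through the model.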

The skill is deterministically added to the prompt by the harness before the target model is invoked. There is no “choosing” to load a skill. You might be confusing skills with tools (MCP etc).

The metadata is loaded by the harness, but the LLM still needs to choose to load the rest of the skill, no?

You are correct. I'm not sure what the parent is trying to say.

Define “load.” It follows the instructions in the prompt - its natural behavior.

I was using the term as you used it in your comment. I believe the official term is "Activation", however:

> Activation: When a task matches a skill’s description, the agent reads the full SKILL.md instructions into context.[1]

> Full instructions load only when a task calls for them, so agents can keep many skills on hand with only a small context footprint.

[1]: https://agentskills.io/home#how-do-agent-skills-work

Ah, I misunderstood this, thanks for the link. You are correct. I was assuming this system worked like CLAUDE.md in that it was deterministically added to the context without the LLM choosing to add it. My mistake.

Concretely, it has to decide whether it is in a circumstance where that skill is useful, pull the instructions into the context, and follow them.

Yep, and as with any other instructions, it can sometimes not pull the skill even if the trigger conditions are there.

It depends on the harness. opencode appears to prompt the models with tools and skills when answering questions.
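
For anyone skimming, the two-stage flow described in this thread can be sketched roughly as below. This is a toy, not any real harness's API: `Skill`, `Harness`, and `toy_model_choice` are hypothetical names, and the keyword match only stands in for the LLM's decision, which in practice is not deterministic.

```python
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    description: str   # metadata, always placed in context by the harness
    instructions: str  # full SKILL.md body, read only on activation

@dataclass
class Harness:
    skills: dict
    context: list = field(default_factory=list)

    def start(self):
        # Deterministic step: only name and description enter the context.
        for skill in self.skills.values():
            self.context.append(f"skill {skill.name}: {skill.description}")

    def activate(self, name):
        # Happens only if the model asks for the skill.
        self.context.append(self.skills[name].instructions)

def toy_model_choice(task, harness):
    """Stand-in for the LLM's choice; a real model may skip a matching skill."""
    for skill in harness.skills.values():
        if skill.name in task.lower():
            return skill.name
    return None

harness = Harness({"pdf": Skill("pdf", "Fill in PDF forms", "FULL PDF INSTRUCTIONS")})
harness.start()
before = list(harness.context)  # metadata only at this point

choice = toy_model_choice("Please fill out this PDF", harness)
if choice is not None:
    harness.activate(choice)    # the full instructions enter the context now
```

The split is the point: `start` is the harness's job and always runs, while `activate` depends on the model deciding the task matches, which is why a weaker model might never pull a skill at all.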