	by ssgodderidge 40 days ago
	The example model in the documentation is 4o-mini; you might want to update that to a more recent model. As an aside, 4o-mini came out months before agent skills were released… I’m curious how well it does at choosing to load skills in the first place?

2 comments

stingraycharles 40 days ago

It’s an artifact of the documentation being AI-generated: the models usually pick GPT-4-era models without giving it further thought.

For Gemini they seem to always pick 2.5 despite 3.1 being the latest, and for Claude the 3.5-era models.

Not sure what’s preventing AI labs from ensuring this stuff is refreshed during training.

simonpure 40 days ago

I was wondering the same and learned that the model doesn't know about itself during training [0].

[0] https://developers.googleblog.com/closing-the-knowledge-gap-...

cyanydeez 39 days ago

The model doesn't know itself, but all these larger models are generating a significant amount of synthetic data from the prior models, and the prior models are all context-bloated renditions; you fill the KV cache with whatever alignment you want, and then generate synthetic data.

That training on existing models is what brings out various other things about other models; then there are models that are just like snowballs, where you build one iteration, then you give it its identity, then you train on that with the same synthetic generation.

So a model's generations could at some point include its own name.

stingraycharles 39 days ago

I don’t think what you’re saying makes a lot of sense. You don’t “fill the KV cache with whatever alignment you want.” That doesn’t exist. The KV cache is an inference optimization, and is populated by running tokens through the model.
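To make the point about the KV cache concrete, here is a minimal sketch in pure Python, assuming a toy single-head attention layer with made-up 2x2 weights (`W_Q`, `W_K`, `W_V` and the `KVCache` class are illustrative names, not any real library's API). It shows that the cache only stores keys and values computed from tokens that were actually run through the model, and that decoding with it gives the same output as recomputing everything.

```python
import math

# Toy, fixed "model weights" (hypothetical values). The cache never changes these.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, 0.1], [0.2, 0.7]]
W_V = [[0.3, 0.9], [0.8, 0.4]]


def project(vec, matrix):
    """Multiply a vector by a small weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over all given keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def decode_without_cache(tokens):
    """Recompute every key and value from scratch, then attend from the last token."""
    keys = [project(t, W_K) for t in tokens]
    values = [project(t, W_V) for t in tokens]
    return attend(project(tokens[-1], W_Q), keys, values)


class KVCache:
    """Stores keys/values of tokens already processed, so they are not recomputed."""

    def __init__(self):
        self.keys, self.values = [], []

    def step(self, token):
        # The only way entries get in: a token is run through the projections.
        self.keys.append(project(token, W_K))
        self.values.append(project(token, W_V))
        return attend(project(token, W_Q), self.keys, self.values)


tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy token embeddings
cache = KVCache()
for tok in tokens:
    cached_out = cache.step(tok)
full_out = decode_without_cache(tokens)
```

After the loop, `cached_out` and `full_out` agree: the cache changes how much work each step does, not what the model computes.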

Synthetic data is generated by other models, and yes this is often where identity propagates.

I think with the snowballing you mean things like iterative self-distillation? That’s definitely not done unsupervised, because of the risk of model collapse; it is typically heavily curated and/or mixed with real data.

block_dagger 40 days ago

The skill is deterministically added to the prompt by the harness before the target model is invoked. There is no “choosing” to load a skill. You might be confusing skills with tools (MCP etc).

ssgodderidge 40 days ago

The metadata is loaded by the harness, but the LLM still needs to choose to load the rest of the skill, no?

albedoa 40 days ago

You are correct. I'm not sure what the parent is trying to say.

block_dagger 40 days ago

Define “load.” It follows the instructions in the prompt - its natural behavior.

ssgodderidge 40 days ago

I was using the term as you used it in your comment. I believe the official term is "Activation", however:

> Activation: When a task matches a skill’s description, the agent reads the full SKILL.md instructions into context.[1]

> Full instructions load only when a task calls for them, so agents can keep many skills on hand with only a small context footprint.

[1]: https://agentskills.io/home#how-do-agent-skills-work
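A rough sketch of the two stages described in the quotes, assuming a hypothetical harness: everything here (`Skill`, `SKILLS`, `build_system_prompt`, `fake_model_choose_skill`, `run_turn`) is invented for illustration and is not the API of agentskills.io or any real agent. The harness always puts the metadata in the prompt; the full instructions enter the context only if the model decides to activate the skill. The model's decision is replaced by a trivial name match, which a real LLM would not be bound by.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str   # short metadata, always placed in the prompt
    instructions: str  # stand-in for the full SKILL.md body, loaded on activation


# Hypothetical skills, for illustration only.
SKILLS = {
    "pdf": Skill("pdf", "Extract text and tables from PDF files.",
                 "Step 1: open the file. Step 2: extract each page."),
    "changelog": Skill("changelog", "Write release notes from commits.",
                       "Step 1: group commits by type. Step 2: summarize."),
}


def build_system_prompt(skills):
    """Harness step: deterministically list name + description only."""
    lines = ["Available skills (load one to read its full instructions):"]
    lines += [f"- {s.name}: {s.description}" for s in skills.values()]
    return "\n".join(lines)


def fake_model_choose_skill(task, skills):
    """Stand-in for the LLM's choice. A real model makes this call itself,
    and may decline to load a skill even when the description matches."""
    lowered = task.lower()
    for skill in skills.values():
        if skill.name in lowered:
            return skill.name
    return None


def run_turn(task, skills):
    """Build the context for one turn: metadata always, full body only if chosen."""
    context = [build_system_prompt(skills), f"User: {task}"]
    chosen = fake_model_choose_skill(task, skills)
    if chosen is not None:
        context.append(f"[skill:{chosen}]\n{skills[chosen].instructions}")
    return context


matched = run_turn("Extract the tables from this pdf report", SKILLS)
unmatched = run_turn("What is 2 + 2?", SKILLS)
```

Here `matched` ends with the full `pdf` instructions, while `unmatched` contains only the metadata listing and the user message, which is the small context footprint the second quote refers to.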

block_dagger 40 days ago

Ah, I misunderstood this, thanks for the link. You are correct. I was assuming this system worked like CLAUDE.md, in that the skill was deterministically added to the context without the LLM choosing to add it. My mistake.

hyperpape 40 days ago

Concretely, it has to decide whether it is in a circumstance where that skill is useful, pull the instructions into the context, and follow them.

cassianoleal 40 days ago

Yep, and as with any other instructions, it can sometimes fail to pull the skill even if the trigger conditions are there.

cyanydeez 39 days ago

It depends on the harness. opencode appears to prompt the models with tools and skills when answering questions.
