Hacker News new | ask | show | jobs
by lukev 463 days ago
I don't disagree, but I think we need to be careful of our vocabulary around the word "model." People are starting to use it to refer to the whole "AI system", rather than the actual transformer model.

This article is talking about models that have been trained specifically for workflow orchestration and tool use. And that is an important development.

But the fundamental architectural pattern isn't different: You run the model in some kind of harness that recognizes tool use invocations, calls to the external tool/rag/codegen/whatever, then feeds the results back into the context window for additional processing.

Architecturally speaking, the harness is a separate thing from the language model. A model can be trained to use Anthropic's MCP, for example, but the capabilities of MCP are not "part" of the model.

A concrete example: A model can't read a webpage without a tool, just like a human can't read a webpage without a computer and web browser.

I just feel like it's important to make a logical distinction between a model and the agentic system using that model. Innovation in both areas is going to proceed along related but different paths.

1 comments

While I appreciate the distinction you're pointing out, I disagree with your conclusion that the agentic system and its environment will remain separate going forward. There are strong incentives to merge the external environment more closely with the model's environment. I can imagine a future where GPUs have a direct network interface an os-like engine that allows them to interoperate with the external environment more directly.

It seems like a natural line of progress as RL is becoming mainstream for language models; if you can build the verifier into the GPU itself, you can drastically speed up training runs and decrease inference costs.