Hacker News new | ask | show | jobs
by jermaustin1 14 days ago
I'm not a very smart person, so take what I say with a grain of salt.

I think the path forward will have agents that use models that are individually specialized tasks (some might use a bigger model, some might use smaller models), then orchestrators that are good at knowing when to use which agent type.

I've played around with this in my own tiny coding agents, for TTRPG NPCs, and even a small experiment where LLMs controlled a MUD client as an NPC that played the game with you (only 5 rooms in the experiment).

Basically, break the tasks down into chunks so you don't have to use generalist models for everything, and can chose the right model for the job.

I'm also running all of this locally, where a generalist foundation model doesn't work, and heavily quantized models don't perform well for all tasks, so for unlimited token budgets, my solution is probably overkill.

1 comments

"Orchestrator" pattern, "only use a big model to do big thinking, use smaller models to do grunt work" is probably what the field would converge to, eventually. Perhaps in form of "dynamic sparsity" - i.e. a family of closely related models allowing inference to transition from 1B class to 100T class on a dime, complete with something like joint KV cache.

But it's a hard pattern to pull off, so I'm not sure how soon we'll see it in action.