by Atomic_Torrfisk 77 days ago
I'm sorry, this just sounds like hypespeak. Can you provide samples?

> once they unlock one capability,

What does it mean to unlock? It's an LLM; nothing is locked. The output is only as good as the context, model and environment. Nothing is hidden or locked.

2 comments

Maybe unlock means "recognize and solve a problem with an order of magnitude fewer tokens than the first time you did it". The same way humans might spend a lot of time thinking about a certain problem and various ways to solve it, but once they have gone through that process and then recognize the problem again, they don't need to go through the same process and can jump right to the solution.
I'll have a stab at this. I'll start with an attempt at justifying the remark that an agent which is a good coder will be good at other tasks.

1. Coding is, as a technical endeavour, relatively difficult (similarly for mathematics). So a model which performs well on this task can be expected to easily handle also-technical-but-slightly-easier tasks, like understanding (musical) harmony theory or counterpoint -- for much the same reason that human programmers/mathematicians/scientists don't struggle to understand those "easier" theories.

2. Reinforcement learning augments a base model's ability to excel in something else that's "difficult", namely to "look ahead" and plan multiple steps in advance. That's literally how the training algorithm works: generating multiple paths at once, and rewarding intermediate steps in those paths which succeed in attaining the goal. And that skill, too, is extremely useful in other domains. An AI agent which learns to break a problem into sub-problems, and then tackle each in turn methodically -- it stands to reason that it can apply that to, say, a business plan.
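
To make point 2 a bit more concrete, here is a toy sketch of the "generate several paths, reward the steps of the ones that reach the goal" idea. It is not how any real model is trained; the task (reach position 3 in exactly 3 moves), the names `sample_path` and `train`, and the `bonus` update rule are all made up for illustration.

```python
import random

ACTIONS = {"left": -1, "right": +1}  # hypothetical toy action set
GOAL, STEPS = 3, 3                   # reach position 3 in exactly 3 moves


def sample_path(weights, rng):
    """Sample one path of STEPS actions, each drawn in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=STEPS)


def train(rounds=50, paths_per_round=8, bonus=0.5, seed=0):
    """Repeatedly sample paths and reinforce the actions of successful ones."""
    rng = random.Random(seed)
    weights = {name: 1.0 for name in ACTIONS}  # start with no preference
    for _ in range(rounds):
        # Generate several candidate paths from the current policy.
        paths = [sample_path(weights, rng) for _ in range(paths_per_round)]
        for path in paths:
            reached_goal = sum(ACTIONS[a] for a in path) == GOAL
            if reached_goal:
                # Credit every step of a path that attained the goal.
                for action in path:
                    weights[action] += bonus
    return weights


if __name__ == "__main__":
    print(train())
```

The only path that reaches the goal here is three moves to the right, so only "right" can ever be reinforced, and the policy drifts toward it as more successful paths get sampled.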

Note: 1 & 2 are not independent, nor is frontier models' excellence in these domains magical: it ultimately boils down to the availability of massive datasets (in particular for coding) and totally objective metrics (in the case of mathematics: solved math problems). That's the key ingredient for reinforcement learning to be so effective.

So: the skills are transferable because they're difficult and require lots of planning. That models are so good at them is a fluke, and in a parallel world where humans created git repo after git repo of business plans, it might be those which we lean on to teach a reinforcement learning algorithm how to "reason" and "plan".

Now let's turn our attention to the "synergies" aspect, which I agree with. Let's say your agentic model, which is already excellent at reasoning and planning, acquires a new or improved capability which allows it to search the domain space, calculate, etc. much better than before -- this capability can now bear upon the plan, or be factored into the plan. For example, the model might be able to say "I don't need to worry about this particular subproblem for now; I can rely on my 'mathematica' capability to deal with it when I absolutely need to."

Or to put it differently: monkeys, like humans, are able to use (rudimentary) tools. They'll take a rock, and use it to crack open a coconut (or whatever). But a human being, with far superior reasoning and planning abilities, takes that tool, and uses it to make an even better tool -- and the result after many iterations of this process is civilization as we know it, while monkeys are still stuck trying to crack open nuts with rocks.