| The paper is here: https://arxiv.org/pdf/2603.19461 This, IMO, is the biggest insight into where we're at and where we're going: > Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability. There's something I noticed early on with LLMs: once they unlock one capability, you can compose it with others to improve further capabilities, related or not. For example, "reflexion" goes into coding: hey, this didn't work, let me try ... Then "tools". Then "reflexion" + "tools". And so on. Workflows whose individual parts aren't that precise get better when you compose them and let one component influence another. E2E coding, for example, gets better by checking with good old-fashioned ("gof") tools (linters, compilers, etc). Then it gets even better by adding a code review stage. Then it gets even better by adding a static analysis phase. Now we're seeing all of this converge on "self-improving" by combining "improving" components. This is really cool. |
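To make the composition idea concrete, here is a minimal Python sketch of a generate/check/retry loop. It is not from the paper. The names (`refine`, `compiles`, `no_bare_except`, `scripted_generator`) are hypothetical, and the "model" is a scripted stub standing in for an LLM call. The compiler check uses Python's built-in `compile()`, and the lint rule is a toy one built on the standard `ast` module.

```python
import ast
from typing import Callable, List, Optional, Tuple

# A check takes source code and returns an error message, or None if it passes.
Check = Callable[[str], Optional[str]]


def compiles(source: str) -> Optional[str]:
    """'Compiler' stage: does the candidate even parse?"""
    try:
        compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return f"syntax error: {exc.msg}"
    return None


def no_bare_except(source: str) -> Optional[str]:
    """Toy 'linter' stage. Assumes `compiles` already passed, so parsing is safe."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            return "lint: bare 'except:' found"
    return None


def run_checks(source: str, checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the first failure message, if any."""
    for check in checks:
        error = check(source)
        if error is not None:
            return error
    return None


def refine(
    generate: Callable[[Optional[str]], str],
    checks: List[Check],
    max_rounds: int = 5,
) -> Tuple[Optional[str], List[Optional[str]]]:
    """Reflexion-style loop: feed each failure back into the generator."""
    feedback: Optional[str] = None
    history: List[Optional[str]] = []
    for _ in range(max_rounds):
        candidate = generate(feedback)
        feedback = run_checks(candidate, checks)
        history.append(feedback)
        if feedback is None:
            return candidate, history
    return None, history


def scripted_generator(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM: reacts to the last feedback message."""
    if feedback is None:
        return "def parse(x)\n    return int(x)\n"  # missing colon
    if feedback.startswith("syntax error"):
        return (
            "def parse(x):\n    try:\n        return int(x)\n"
            "    except:\n        return 0\n"  # parses, but fails the lint rule
        )
    return (
        "def parse(x):\n    try:\n        return int(x)\n"
        "    except ValueError:\n        return 0\n"
    )


candidate, history = refine(scripted_generator, [compiles, no_bare_except])
print(len(history))  # 3 rounds: syntax failure, lint failure, then a pass
```

Neither check is smart on its own, but each catches a different failure, and the loop turns those failures into feedback for the next attempt. A review or static analysis stage would, under the same assumptions, just be one more entry in the `checks` list.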