by R00mi
57 days ago
The 56% conversation vs 21% coding split is a really interesting finding — it lines up with trajectory studies on SWE-bench where ~38% of an agent's actions are pure exploration (grep, find, file reads). The remaining "no-tool" turns are likely the agent digesting what it read and planning its next move.
These two costs are linked: the less efficiently the agent localizes, the more thinking turns it needs to piece things together. PatchPilot (ICML 2025) quantified this — localization capability accounts for ~47% of an agent's total improvement.
One thing that would be really interesting in your tool: separating exploration turns (grep/find/read) from pure thinking turns, and seeing how the ratio scales with project size. On large monorepos, exploration should blow up non-linearly.
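To make the exploration vs. thinking split concrete, here is a rough sketch of how it could be computed. I don't know your tool's log format, so everything here is an assumption: each turn is represented as a list of tool-name strings, and the tool names in `EXPLORATION_TOOLS` and the helpers `classify_turn` and `turn_breakdown` are hypothetical.

```python
from collections import Counter

# Hypothetical tool names; replace with whatever the agent's log actually records.
EXPLORATION_TOOLS = {"grep", "find", "read"}


def classify_turn(tool_calls):
    """Label one turn from the list of tool names it invoked."""
    if not tool_calls:
        # No tool call at all: the agent is only reasoning or replying.
        return "thinking"
    if all(name in EXPLORATION_TOOLS for name in tool_calls):
        # Every call is a search or file read, so nothing was modified.
        return "exploration"
    # Anything else (edits, shell commands, mixed turns).
    return "other"


def turn_breakdown(turns):
    """Return the fraction of turns in each category."""
    counts = Counter(classify_turn(calls) for calls in turns)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        label: counts[label] / total
        for label in ("exploration", "thinking", "other")
    }


# Example session: two exploration turns, one thinking turn, one edit.
session = [["grep"], [], ["read", "find"], ["edit"]]
breakdown = turn_breakdown(session)
```

Running `turn_breakdown` per session and plotting the exploration fraction against file count or lines of code would be one way to check whether it really grows non-linearly on monorepos.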