|
|
|
|
|
by felix9527
90 days ago
|
|
The study only looks at what lands in the PR. In my experience a single prompt can trigger 20+ tool calls, most of them reads and greps. The final diff is a tiny fraction of what actually happened. Hard to judge quality without seeing the process. |
|