Show HN: SurvivalIndex – which developer tools do AI agents choose?
(survivalindex.org)
1 point
by scalefirst
102 days ago
We've been running coding agents against standardized repos
with natural-language prompts (no tool names, no hints)
and measuring which tools they actually choose.

Early finding: Claude Code picks Custom/DIY in 12 of 20
categories. That's not because it can't use the tools (BFCL scores
suggest it can) but because it doesn't reach for them.
That's a different failure mode from the one capability benchmarks
measure.

We score each tool on agent visibility, pick rate vs
Custom/DIY, cross-context breadth, expert human ratings,
and implementation success rate. Tools with a survival score above 1
persist. Below 1, agents synthesize around them.

Methodology is at survivalindex.org/methodology. Very
curious what people think of the measurement approach,
especially the human coefficient variable.
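To make the "pick rate vs Custom/DIY" factor concrete, here is a rough Python sketch of how it could be tallied from run logs. The (category, choice) record format, the function names, and the sample data are all made up for illustration. This is not the SurvivalIndex implementation, and it doesn't attempt to show how the five factors combine into the survival score.

```python
from collections import Counter, defaultdict

DIY = "Custom/DIY"  # label used when the agent writes its own solution


def pick_rates(trials):
    """Return {category: {choice: share of runs}} from (category, choice) pairs."""
    by_category = defaultdict(Counter)
    for category, choice in trials:
        by_category[category][choice] += 1
    return {
        category: {choice: n / sum(counts.values()) for choice, n in counts.items()}
        for category, counts in by_category.items()
    }


def diy_wins(rates):
    """List categories where Custom/DIY out-picks every named tool (ties excluded)."""
    wins = []
    for category, shares in rates.items():
        diy_share = shares.get(DIY, 0.0)
        best_tool_share = max(
            (share for choice, share in shares.items() if choice != DIY),
            default=0.0,
        )
        if diy_share > best_tool_share:
            wins.append(category)
    return sorted(wins)


# Hypothetical run log: one entry per agent run, not real measurements.
trials = [
    ("auth", "Custom/DIY"), ("auth", "Custom/DIY"), ("auth", "tool_a"),
    ("queue", "tool_b"), ("queue", "tool_b"), ("queue", "Custom/DIY"),
]

rates = pick_rates(trials)
print(diy_wins(rates))  # categories where the agent preferred rolling its own
```

Counting the categories returned by `diy_wins` over a full set of runs would give a figure of the same shape as "Custom/DIY in 12 of 20 categories", assuming a category counts as Custom/DIY when that choice has the highest pick rate.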