by sho
57 days ago
Taking the article's 5% accuracy improvement at face value: if true, then it's more than worth the token inflation IMO. That's because of tool-call chains, where errors compound and small improvements in accuracy get greatly magnified. If every call in a chain has to succeed and the calls are independent, the chance the whole chain succeeds is the per-call accuracy raised to the number of calls. Again, the article's numbers are likely a rather crude approximation, but taking 85% accuracy (Claude 4.6) vs 90% (4.7) as inputs:

4.6 1 iteration 85%
4.7 1 iteration 90%
4.6 5 iterations 44.37%
4.7 5 iterations 59.05%
4.6 10 iterations 19.69%
4.7 10 iterations 34.87%

Compounded, small improvements really move the needle downstream. 1.4x the tokens doesn't seem worth it for 5 points of accuracy, but 10 calls in, the success rate is about 77% higher in relative terms (34.87% vs 19.69%).
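
For anyone who wants to check the arithmetic, here's a minimal sketch. It assumes each call succeeds independently and that a single failure sinks the whole chain, which is a simplification (real agents can retry or recover). The function name `chain_success` is just something I made up for illustration.

```python
def chain_success(per_call_accuracy: float, calls: int) -> float:
    """Probability that every call in a chain succeeds.

    Assumes calls are independent and one failure fails the chain.
    """
    return per_call_accuracy ** calls


if __name__ == "__main__":
    # 0.85 and 0.90 are the rough per-call accuracies used in the comment above.
    for calls in (1, 5, 10):
        old = chain_success(0.85, calls)
        new = chain_success(0.90, calls)
        print(f"{calls:>2} calls: 4.6 {old:.2%}  4.7 {new:.2%}  ratio {new / old:.2f}x")
```

The ratio column is the relative gain, which grows with chain length because it is (0.90 / 0.85) raised to the number of calls.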