by Imnimo 533 days ago
4.2x doesn't mean anything if you don't tell me what "accuracy" Sonnet 3.5 had.
1 comment

I agree. Our early local results were promising: a higher percentage of code change requests produced functionally correct output. We will post more metrics and data in the future.