by hellodanylo 1232 days ago
Yeah, I am also struggling to interpret the metrics in this post positively.

The 50% success rate is also the best out of 3200 completions. For the best out of 1 completion, the success rate is in the low single digits.

I think the lesson here is that these models bring a lot more value when you: 1. have unit tests, 2. can afford the compute/time to let the model try many solutions, and 3. have enough isolation to run unverified code.
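
To make that concrete, here is a rough sketch of the "generate many, keep what passes the tests" loop. The names passes_tests and first_passing are made up for illustration, and the candidates are assumed to be plain strings of Python source that some model has already produced. Note that a child process with a timeout is only a stand-in for isolation here; for genuinely untrusted code you would want a container or VM.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def passes_tests(candidate: str, test_code: str, timeout_s: float = 5.0) -> bool:
    """Run candidate + tests in a separate Python process; True if it exits with 0.

    Assumption: test_code is a block of plain assert statements that
    exercise whatever the candidate defines.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "attempt.py"
        script.write_text(candidate + "\n\n" + test_code + "\n")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout_s,  # guards against infinite loops
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
        return result.returncode == 0


def first_passing(candidates: Iterable[str], test_code: str) -> Optional[str]:
    """Return the first candidate that passes the tests, or None if none do."""
    for candidate in candidates:
        if passes_tests(candidate, test_code):
            return candidate
    return None


if __name__ == "__main__":
    # Two hand-written "completions" standing in for model output.
    candidates = [
        "def add(a, b):\n    return a - b",  # wrong
        "def add(a, b):\n    return a + b",  # right
    ]
    tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"
    print(first_passing(candidates, tests))
```

The cost of this loop grows linearly with the number of candidates, which is where point 2 bites: with thousands of completions per problem, the test runs can easily dominate.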