Hacker News
by
DarkNova6
90 days ago
I'm never sure how much faith one can put in such benchmarks, but in any case the picture seems to shift once you look at pass@2 and pass@3 rather than only the first attempt.
Still, the more interesting comparison would be against something like Codex.
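For anyone unfamiliar with the metric: pass@k is usually read as the chance that at least one of k sampled solutions passes the tests. A minimal sketch of the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), is below. The per-problem counts and the two "models" are made up purely for illustration and are not taken from the benchmark under discussion; they only show how a ranking can flip between pass@1 and pass@3.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: samples generated, c: samples that passed, k: attempts allowed.
    Returns the probability that a random size-k subset of the n samples
    contains at least one passing sample.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average the per-problem pass@k over a whole benchmark."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Hypothetical data: passing samples out of n = 10 on two problems.
model_a = [10, 0]  # always solves problem 1, never solves problem 2
model_b = [4, 4]   # solves each problem 4 times out of 10

for k in (1, 2, 3):
    a = benchmark_pass_at_k(model_a, n=10, k=k)
    b = benchmark_pass_at_k(model_b, n=10, k=k)
    print(f"pass@{k}: A={a:.3f}  B={b:.3f}")
```

With these invented numbers, model A leads at pass@1 but stays flat as k grows, while model B overtakes it by pass@2, which is the kind of shift the comment is pointing at.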