Hacker News
by HDBaseT 3 days ago
Aren't benchmarks exactly that?
We used the AI to solve a given problem with x% adherence/quality/correctness?