Hacker News
Beyond Benchmark Maxxing: Measuring Open Source Models as Real-World Agents (ultravox.ai)
1 point by zkoch 298 days ago