Hacker News
by skysniper, 81 days ago
Thanks for the info. Before running the bench I had only tried it on arena.ai-type tasks, and it was not impressive. I didn't expect it to be that good at agentic tasks.