Hacker News
OpenGameEval: Eval Framework to Benchmark Agentic AI Assistants (corp.roblox.com)
7 points by moneil971 188 days ago | 1 comment
moneil971 188 days ago
OpenGameEval offers a unique testing ground to evaluate core model capabilities related to agentic reasoning and long-horizon task solving.