Claude-3.7 outperforms other models in realtime Super Mario Bros (x.com)
3 points by snyhlxde 481 days ago
1 comment

AI gaming agents perform surprisingly well in real time and are very easy to deploy.

We see a very scalable approach to AI evaluation ahead. Check out how we did it!