Hacker News
by segh 445 days ago
Claude Plays Pokemon is one person's side project to see how well Sonnet can play Pokemon. It is a neat LLM benchmark, not a serious attempt at making a Pokemon-playing AI.
1 comment

It may not be serious, but it's a true display of an LLM's limitations. A bad look for Claude, and a missed advertising opportunity if someone can do better.