Hacker News
Zork-bench: An LLM reasoning eval based on text adventure games (lowimpactfruit.com)
5 points by mnky9800n 64 days ago