Hacker News
Zork-bench: An LLM reasoning eval based on text adventure games (lowimpactfruit.com)
2 points by nicholasjbs 45 days ago