by lottaFLOPS 317 days ago
Related research that was also announced this week: https://www.textquests.ai/
3 comments

They seem to be taking a much simpler route: just giving the LLM a full transcript of the game with its own reasoning interspersed. I didn't have much luck with that, and I'm worried it might not be effective once we're into the hundreds of turns because of inadvertent context poisoning. That might indeed be what happens, given the slowing of progress indicated in the paper.
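For anyone wondering what "full transcript with reasoning interspersed" looks like in practice, here is a rough Python sketch. It is not taken from the paper or from my own harness: `Turn`, `build_prompt`, the `GAME:`/`THOUGHT:`/`ACTION:` labels, and the `max_turns` window are all names I made up for illustration. Leaving `max_turns` unset gives the full-transcript approach; setting it keeps only the most recent turns, which is one possible (untested) way to limit how much stale reasoning the model sees.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    observation: str  # text the game printed this turn
    reasoning: str    # the model's own reasoning for this turn
    action: str       # the command the model issued


def build_prompt(turns: List[Turn], current_observation: str,
                 max_turns: Optional[int] = None) -> str:
    """Render the game history as a single prompt string.

    max_turns=None keeps the whole transcript. An integer keeps only the
    most recent turns (a hypothetical mitigation, not the paper's method).
    """
    if max_turns is None:
        kept = turns
    elif max_turns <= 0:
        kept = []  # guard: turns[-0:] would return the whole list
    else:
        kept = turns[-max_turns:]

    omitted = len(turns) - len(kept)
    lines = []
    if omitted:
        lines.append(f"[{omitted} earlier turns omitted]")

    # Interleave game output with the model's earlier reasoning and actions.
    for i, turn in enumerate(kept, start=omitted + 1):
        lines.append(f"Turn {i}")
        lines.append(f"GAME: {turn.observation}")
        lines.append(f"THOUGHT: {turn.reasoning}")
        lines.append(f"ACTION: {turn.action}")

    # End on the latest game output and an open slot for new reasoning.
    lines.append(f"GAME: {current_observation}")
    lines.append("THOUGHT:")
    return "\n".join(lines)
```

With the full transcript, every earlier `THOUGHT:` line stays in context forever, including the wrong ones, which is the poisoning risk I mean. The windowed variant trades that for forgetting early clues, so it is not obviously better.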
Very interesting how they all clearly suck at it. Even with hints, they can't understand the task enough to complete the game.
That's a great tracker. How often is the leaderboard updated?