| HN Mirror


	by lottaFLOPS 317 days ago
	related research that was also announced this week: https://www.textquests.ai/

3 comments

kqr 317 days ago

They seem to be going for a much simpler route of just giving the LLM a full transcript of the game with its own reasoning interspersed. I didn't have much luck with that, and I'm worried it might not be effective once we're into the hundreds of turns because of inadvertent context poisoning. It seems like this might indeed be what happens, given the slowing of progress indicated in the paper.
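
A minimal sketch of the full-transcript approach described above, to make the context-growth concern concrete. The `Turn` type, its field names, and the `[Turn N]` line format are hypothetical and chosen for illustration; they are not taken from the linked paper or from any particular harness.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    observation: str  # text the game printed this turn
    reasoning: str    # the model's own notes for this turn
    action: str       # the command the model issued


def build_prompt(history: list[Turn], current_observation: str) -> str:
    """Concatenate every past turn, reasoning included, into one prompt."""
    parts = []
    for i, turn in enumerate(history, start=1):
        parts.append(f"[Turn {i}] GAME: {turn.observation}")
        parts.append(f"[Turn {i}] REASONING: {turn.reasoning}")
        parts.append(f"[Turn {i}] ACTION: {turn.action}")
    # The newest observation goes last, so the model continues from there.
    parts.append(f"[Turn {len(history) + 1}] GAME: {current_observation}")
    return "\n".join(parts)
```

In this sketch the prompt grows by three lines per turn, and any mistaken reasoning from an early turn is replayed verbatim on every later call, which is one way the context poisoning mentioned above could arise.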

1970-01-01 317 days ago

Very interesting how they all clearly suck at it. Even with hints, they can't understand the task enough to complete the game.

abraxas 317 days ago

that's a great tracker. How often is the leaderboard updated?
