by chongli 898 days ago
You mean an LLM-based adventure game? There’s AI Dungeon [1]. Or do you mean a more sophisticated conventional parser with a hand-crafted adventure? That I’m not aware of.

From what I know about adventure games with parsers, they’re essentially just a big finite state machine with descriptions for each state and a hard-coded list of commands that trigger state transitions. There may be additional commands that print a whimsical or humorous response but otherwise leave the player in exactly the same state. There’s also usually a default response for when the input can’t be parsed into a valid command.
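To make that concrete, here is a minimal sketch of the structure described above: a table of state descriptions, a hard-coded transition table, a few flavor commands that leave the state alone, and a default response. The two rooms, the commands, and the `step` function are all made up for illustration and are not taken from any real game or engine.

```python
# Hypothetical two-room game; every name here is illustrative.
ROOMS = {
    "cellar": "You are in a damp cellar. Stairs lead up.",
    "kitchen": "You are in a kitchen. Stairs lead down.",
}

# (state, command) -> next state
TRANSITIONS = {
    ("cellar", "go up"): "kitchen",
    ("kitchen", "go down"): "cellar",
}

# Commands that print a response but leave the state unchanged.
FLAVOR = {
    "xyzzy": "Nothing happens.",
}

# Fallback when the input matches no known command.
DEFAULT = "I don't understand that."


def step(state, command):
    """Return (new_state, response) for one player command."""
    # Normalize case and whitespace so "  GO   Up " matches "go up".
    command = " ".join(command.lower().split())
    if (state, command) in TRANSITIONS:
        new_state = TRANSITIONS[(state, command)]
        return new_state, ROOMS[new_state]
    if command in FLAVOR:
        return state, FLAVOR[command]
    return state, DEFAULT
```

Calling `step("cellar", "go up")` moves the player to the kitchen and returns its description, while an unrecognized command returns the same state with the default response. A real parser game would layer inventory and flags on top, but those can be treated as extra components of the state.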

The AI Dungeon approach, as far as I’m aware, isn’t really a game with rules and a reachable end condition. It’s just an LLM trained on fantasy role-playing text, so it can produce plausible output in response to any prompt. What it lacks is the underlying state machine.

[1] https://play.aidungeon.com/


The latter. As you say, adventure games are basically state machines as I understand it. (I know a bunch of the Infocom folks but never dug into the technical details.)

I guess I don’t have a clear idea of exactly what I’m asking for, aside from a general feeling that you could create more natural-feeling interactions with today’s technologies. But I’m not sure what that would look like. I should chat with one of my Infocom friends one of these days.

I think it would just take a ton of work. If you’re not willing to make the leap to AI (and non-determinism) then conventional technology doesn’t have much to offer for NLP (which is what the task ultimately boils down to).

Trying to build a deterministic, logic-based NLP system for interacting with a game world runs into what is essentially a special case of the frame problem [1] in AI.

[1] https://en.wikipedia.org/wiki/Frame_problem

I'm guessing it would be like a fever dream to let an LLM be the dungeon master in isolation. Things would change after the fact in weird ways, especially when the context grows too large.

But what about coupling an LLM with a physics simulator and a 3D world model?

You still interact with the LLM through a text interface, but hidden conversations take place with the simulator, where the LLM can interrogate the current state of the 3D world in order to describe it to the player. You could even do this using GPT4-Vision to interpret rendered images. When the player performs an action, it is translated into "physical" actions in the 3D world simulator, which then updates its state.
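A rough sketch of that loop, with heavy assumptions: the `World` class below stands in for the physics simulator and 3D world model, and `parse_action` and `narrate` are plain-Python placeholders for the two hidden LLM calls (free text to structured action, queried facts to prose). None of these names come from a real library. The point is only that the simulator owns the state, so the narrator cannot change things after the fact.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Stand-in for the simulator: the only place state lives."""
    player_room: str = "hall"
    items: dict = field(default_factory=lambda: {"lamp": "hall"})

    def query(self):
        """Return the facts the narrator is allowed to describe."""
        here = sorted(k for k, v in self.items.items() if v == self.player_room)
        held = sorted(k for k, v in self.items.items() if v == "player")
        return {"room": self.player_room, "visible": here, "held": held}

    def apply(self, action):
        """Apply a structured action; return True if it was legal."""
        verb, obj = action
        if verb == "take" and self.items.get(obj) == self.player_room:
            self.items[obj] = "player"
            return True
        return False


def parse_action(text):
    """Placeholder for the LLM call that maps free text to an action."""
    words = text.lower().split()
    if words and words[0] in {"take", "grab", "get"}:
        return ("take", words[-1])
    return ("noop", "")


def narrate(facts, ok):
    """Placeholder for the LLM call that turns queried facts into prose."""
    prefix = "Done." if ok else "That doesn't work."
    return (f"{prefix} You are in the {facts['room']}. "
            f"You see: {facts['visible']}. You hold: {facts['held']}.")


def turn(world, text):
    """One player turn: translate, update the simulator, then describe it."""
    ok = world.apply(parse_action(text))
    return narrate(world.query(), ok)
```

With `w = World()`, calling `turn(w, "grab the lamp")` moves the lamp into the player's hands inside the simulator, and a second identical call is rejected because the lamp is no longer in the room. Swapping the two placeholders for real model calls would leave `World` unchanged.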

It feels like someone should have done this already?