The old Infocom parsers were quite advanced and supported far richer grammar than I ever considered using when playing a game like that.
Later Infocom games, and later hobby interactive fiction, have conventions like allowing some abbreviations (x for examine, l for look, i for inventory). My transcripts are full of those. Even if the parser will probably correctly parse a sentence like "I want to look at the table, please", in practice most of us will just type "x table" anyway.
The parser also allowed for combined commands, at least in somewhat later games, so you could type things like (I think) "examine the table and then pick up everything that is on it. then ask the dwarf to go north" as a single command (though that would still be split up into several actions, as if they had been entered as several shorter commands). I never learned enough about the parser to dare use things like that, as I was never quite sure what it would understand. It is easier to just input one thing at a time and see what happens before typing in the next thing.
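To make the "split up into several actions" idea concrete, here is a minimal sketch in Python. It is not how Infocom's parser worked internally; the separator rules and the names `ABBREVIATIONS` and `split_commands` are assumptions made up for illustration. It only splits a line into shorter commands and expands a leading abbreviation; it does not try to understand phrases like "everything that is on it".

```python
import re

# Hypothetical abbreviation table; x, l and i are the conventions mentioned above.
ABBREVIATIONS = {"x": "examine", "l": "look", "i": "inventory"}

# Assumed separators between actions: a period, "and then", or "then".
_SEPARATORS = re.compile(r"\s*(?:\.|\band then\b|\bthen\b)\s*", re.IGNORECASE)


def split_commands(line):
    """Split one input line into a list of single-action commands."""
    commands = []
    for part in _SEPARATORS.split(line):
        words = part.split()
        if not words:
            # Adjacent separators (". then") leave empty pieces; skip them.
            continue
        # Expand a leading abbreviation such as "x" into "examine".
        words[0] = ABBREVIATIONS.get(words[0].lower(), words[0])
        commands.append(" ".join(words))
    return commands


if __name__ == "__main__":
    for command in split_commands("x table and then take lamp. then go north"):
        print(command)
```

Each resulting command would then be handed to the game one at a time, exactly as if the player had typed them on separate lines.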
While the parsers can be impressive, I just as happily go back to play the simpler games, like the ones by Scott Adams that used a two-word parser and only read the first three letters of each word. You really don't need more than that for good player input. That user interface is much easier to use, as there are fewer things you may have to try to make the parser understand what you want to do.
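A two-word parser of that kind is small enough to sketch in a few lines of Python. The vocabulary below and the names `VERBS`, `NOUNS` and `parse_two_words` are invented for illustration and are not taken from any actual Scott Adams game; the only parts that come from the description above are the two-word limit and matching on the first three letters.

```python
# Hypothetical vocabulary, keyed by the first three letters of each word.
# Several words can map to the same internal verb (take/get).
VERBS = {"get": "GET", "tak": "GET", "dro": "DROP", "go": "GO",
         "loo": "LOOK", "exa": "EXAMINE"}
NOUNS = {"lam": "LAMP", "tab": "TABLE", "nor": "NORTH", "sou": "SOUTH"}


def parse_two_words(line):
    """Return (verb, noun), (verb, None), or None if the input is not understood."""
    words = line.lower().split()
    if not words or len(words) > 2:
        return None
    # Only the first three letters of each word are significant.
    verb = VERBS.get(words[0][:3])
    if verb is None:
        return None
    if len(words) == 1:
        return (verb, None)
    noun = NOUNS.get(words[1][:3])
    if noun is None:
        return None
    return (verb, noun)


if __name__ == "__main__":
    print(parse_two_words("take lamp"))
    print(parse_two_words("examine table"))
```

A side effect of the three-letter rule is that "exa table" and "examine tablecloth" both parse as the same command, which is part of why such parsers feel forgiving despite being tiny.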
You mean an LLM-based adventure game? There’s AI Dungeon [1]. Or do you mean a more sophisticated conventional parser with a hand-crafted adventure? That I’m not aware of.
From what I know about adventure games with parsers, they’re essentially just a big finite state machine with descriptions for each state and a hard-coded list of commands that trigger state transitions. There may be additional commands that print a whimsical or humorous response but otherwise leave the player in exactly the same state. There’s also usually a default response for when the input can’t be parsed into a valid command.
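As a rough illustration of that state-machine view, here is a minimal Python sketch. The two rooms, the flavor response and the names `ROOMS`, `FLAVOR`, `DEFAULT_RESPONSE` and `step` are all hypothetical; real games track far more state (inventory, flags, timers) than a single room name.

```python
# Hypothetical world: each state has a description and a table of
# commands that trigger transitions to other states.
ROOMS = {
    "cellar": {
        "description": "A damp cellar. Stairs lead up.",
        "exits": {"go up": "kitchen"},
    },
    "kitchen": {
        "description": "A bright kitchen. Stairs lead down.",
        "exits": {"go down": "cellar"},
    },
}

# Commands that print a response but leave the state unchanged.
FLAVOR = {"sing": "Your voice echoes unpleasantly."}

# Fallback for input that matches nothing.
DEFAULT_RESPONSE = "I don't understand that."


def step(state, command):
    """Apply one command and return (new_state, response)."""
    command = command.strip().lower()
    exits = ROOMS[state]["exits"]
    if command in exits:
        new_state = exits[command]
        return new_state, ROOMS[new_state]["description"]
    if command == "look":
        return state, ROOMS[state]["description"]
    if command in FLAVOR:
        return state, FLAVOR[command]
    return state, DEFAULT_RESPONSE


if __name__ == "__main__":
    state = "cellar"
    for command in ["look", "sing", "go up", "xyzzy"]:
        state, response = step(state, command)
        print(f"> {command}\n{response}")
```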
The AI Dungeon approach, as far as I’m aware, isn’t really a game with rules and a reachable end condition. It’s just an LLM trained on fantasy role-playing descriptions, so it can produce a lot of text in response to any prompt. What it seems to lack is the underlying state machine.
The latter. As you say, adventure games are basically state machines as I understand it. (I know a bunch of the Infocom folks but never dug into the technical details.)
I guess I don’t have a clear idea of exactly what I’m asking for aside from a general feeling that you could create more natural feeling interactions with today’s technologies. But I’m not sure what that would look like. I should chat with one of my Infocom friends one of these days.
I think it would just take a ton of work. If you’re not willing to make the leap to AI (and non-determinism) then conventional technology doesn’t have much to offer for NLP (which is what the task ultimately boils down to).
The trouble is that building a deterministic, logic-based NLP system for interacting with a game world is essentially a special case of the frame problem [1] in AI.
I'm guessing it would be like a fever dream to let an LLM be the dungeon master in isolation. Things would change after the fact in weird ways, especially when the context grows too large.
But what about coupling an LLM with a physics simulator and a 3D world model?
You still interact with the LLM through a text interface, but hidden conversations take place with the simulator, in which the LLM interrogates the current state of the 3D world in order to describe it to the player. You could even do this using GPT4-Vision to interpret rendered images. When the player performs an action, it is translated into "physical" actions in the 3D world simulator, which then updates its state.
It feels like someone should have done this already?
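A very rough Python sketch of that division of labor, with everything stubbed out: the `World` class stands in for the simulator, and `stub_narrator` stands in for the LLM call. All names here are hypothetical, no real LLM or physics API is used, and the step that turns free text into a structured action (which would presumably be another model call) is left out. The point is only that the world object owns the state and the narrator merely words the facts it is given.

```python
class World:
    """Hypothetical stand-in for the simulator: the only source of truth."""

    def __init__(self):
        self.positions = {"player": "hall", "lamp": "hall"}

    def query(self):
        """Return the facts the narrator is allowed to describe."""
        here = self.positions["player"]
        items = sorted(name for name, place in self.positions.items()
                       if place == here and name != "player")
        return {"location": here, "items": items}

    def apply(self, action, target):
        """Apply a structured action; return True if the state changed."""
        if (action == "take" and target != "player"
                and self.positions.get(target) == self.positions["player"]):
            self.positions[target] = "inventory"
            return True
        return False


def stub_narrator(facts):
    """Placeholder for an LLM call: turns facts into prose, holds no state."""
    seen = ", ".join(facts["items"]) or "nothing"
    outcome = "Done." if facts["succeeded"] else "You can't do that."
    return f"{outcome} You are in the {facts['location']}. You see: {seen}."


def turn(world, action, target, narrate):
    """One game turn: update the world first, then describe its new state."""
    succeeded = world.apply(action, target)
    facts = world.query()
    facts["succeeded"] = succeeded
    return narrate(facts)


if __name__ == "__main__":
    world = World()
    print(turn(world, "take", "lamp", stub_narrator))
    print(turn(world, "take", "lamp", stub_narrator))
```

Because the narrator is rebuilt from `query()` every turn, it cannot quietly change what happened earlier, which is the failure mode described above for an LLM running in isolation.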
I also wrote some toy "interactive fiction" things (with less sophisticated parsers) in Python and Lua as a way to gain familiarity with those languages. They are not very interesting in and of themselves, but they demonstrate a fairly standard technique behind these kinds of games in a compact way.
I was playing some classic text adventures recently, and one improvement to existing games that struck me as both straightforward and helpful would be a wrapper that probes the code to find all valid commands, e.g. "get lamp", and then uses vector embeddings so that "grab light", "take gaslight" or other basically equivalent phrases don't give you that immersion-shattering response of "There is no <light> visible" or "I do not understand <take>", which makes the parser seem dumb sometimes.
It could fix typos at the same time (which I think some games and engines do to some extent already)
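Here is a minimal Python sketch of such a wrapper, with two loud caveats. First, it does not probe any game code; `VALID_COMMANDS` is a hand-written, hypothetical list. Second, it does not use vector embeddings; a small hand-written `SYNONYMS` table stands in for them, and the standard library's `difflib.get_close_matches` handles typos. A real version would replace the table lookup with a nearest-neighbor search over embeddings.

```python
import difflib

# Hypothetical list of commands the underlying game accepts.
VALID_COMMANDS = ["get lamp", "drop lamp", "open door"]

# Hand-written stand-in for an embedding lookup: maps words to the
# vocabulary the game actually understands.
SYNONYMS = {"grab": "get", "take": "get", "light": "lamp", "gaslight": "lamp"}


def _known_words():
    """All words we can recognize: game vocabulary plus synonyms."""
    words = set(SYNONYMS)
    for command in VALID_COMMANDS:
        words.update(command.split())
    return sorted(words)


def normalize(line):
    """Map player input to a valid command, or return None if no match."""
    known = _known_words()
    result = []
    for word in line.lower().split():
        if word not in known:
            # Typo correction: pick the closest known word, if close enough.
            close = difflib.get_close_matches(word, known, n=1, cutoff=0.8)
            if close:
                word = close[0]
        # Synonym replacement (the part embeddings would do in practice).
        result.append(SYNONYMS.get(word, word))
    candidate = " ".join(result)
    return candidate if candidate in VALID_COMMANDS else None


if __name__ == "__main__":
    for text in ["grab light", "take gaslight", "gett lampp", "dance wildly"]:
        print(f"{text!r} -> {normalize(text)!r}")
```

When `normalize` returns `None`, the wrapper would pass the original input through unchanged so the game can give its own response.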