Hacker News
by MyFirstSass 755 days ago
I've been curious as to when games would implement any of these new technologies, but I think they are simply too slow for now?

I think we're at least 10-15 years from being able to run low-latency agents that "RAG" themselves into the games they are a part of, where there are hundreds of them, some of them NPCs, others controlling some game mechanic or checking whether the output from other agents is acceptable or needs to be run again.

At the moment a 16 GB MacBook Air can run Phi-Medium 14gb, which is extremely impressive, but at 7 tokens per second it's way too slow for any kind of gaming. You'd need to 100x the performance, and we need 5+ generations before I can see this happening.
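To put the throughput gap in rough numbers, here is a back-of-the-envelope sketch. Only the 7 tokens per second figure comes from the post; the 40-token reply length and the count of 100 agents are illustrative assumptions, and the estimate ignores prompt processing and batching.

```python
# Back-of-the-envelope latency estimate. Only TOKENS_PER_SECOND comes from the
# post above; the reply length and agent count are illustrative assumptions.
TOKENS_PER_SECOND = 7.0   # figure quoted for a 16 GB MacBook Air
REPLY_TOKENS = 40         # assumed length of one short NPC line
AGENTS = 100              # assumed number of agents sharing one machine


def seconds_per_reply(tokens_per_second: float, reply_tokens: int) -> float:
    """Time to generate one reply, ignoring prompt processing."""
    return reply_tokens / tokens_per_second


def seconds_for_all_agents(tokens_per_second: float, reply_tokens: int, agents: int) -> float:
    """Time for every agent to produce one reply, run one after another."""
    return agents * seconds_per_reply(tokens_per_second, reply_tokens)


single = seconds_per_reply(TOKENS_PER_SECOND, REPLY_TOKENS)
everyone = seconds_for_all_agents(TOKENS_PER_SECOND, REPLY_TOKENS, AGENTS)
sped_up = seconds_for_all_agents(TOKENS_PER_SECOND * 100, REPLY_TOKENS, AGENTS)

print(f"one reply: {single:.1f} s")
print(f"{AGENTS} agents, sequential: {everyone / 60:.1f} min")
print(f"{AGENTS} agents at 100x speed: {sped_up:.1f} s")
```

Under these assumptions a single short reply takes about 5.7 seconds and one sequential round of 100 agents takes roughly 9.5 minutes. A 100x speedup only brings the whole round back down to the time a single reply takes today.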

Unless there's some other application?

7 comments

> for games: i think they are simply too slow for now?

I think it's two-fold. The primary reason is that it's likely very difficult to maintain a designer's storyline vision and desired "atmosphere / feel", because LLMs currently "go off the rails" too easily. The second is that the teams with enough funding to properly fine-tune generative AI for dialog, level/environment creation, character generation, etc. are generally making AAA or AAA-adjacent games, which already need so much of a consumer GPU's VRAM that there's not a lot left over for large ML models to run in parallel.

I do think, though, that we should already be seeing indie games doing more with LLMs and 3D character/level/item generation than we are. Of course AI Dungeon has been trailblazing this for a long time, but I just expected to see more widely recognized success by now from many projects. I take this as a signal that it's hard to make a "good" game using AI generation. If anyone has any suggestions for open-world games with a significant amount of AI generation that allows player interaction to significantly affect the in-game universe, I'd be very interested in play-testing them. Can be any genre / style / budget. I just want to see more of what people are accomplishing in this space.

My hope is that there will be space for both the current style of game, where every aspect is created/designed by a human, and games of various types where the world is given an overall narrative/aesthetic/vision by the creators but the details are implemented by AI. That would allow true open-world play where you can finally just walk into any shop, with RAG/etc. providing complete continuity over months/years of play, so that characters remember the conversations/interactions/actions of you and anyone else playing in the same world.

I do think there's something of an "end-game" for this where a game is released that has no game at all in it, but rather generates games for each player based on what they want to play that day, and creates them as you play them. But I'd like to imagine that this won't replace other games (even if it does take a bit of the air out of the room), but rather exist alongside games with human-curated experiences.

I think any NPC with dialogue important to a goal (a quest, a tutorial, etc) is going to be hard to use generative AI for. It not only needs to be coherent with the story, but it needs to correctly include certain ideas. I.e. if the NPC gives a quest to go find some item at some location, it needs to say what the item is and where it is.

I think we're currently stuck in a local minimum where AI isn't up to the task of making a coherent player-interactable world, but an incoherent or fragmented and non-interactable world isn't impressive enough (like No Man's Sky).

Agreed for current systems. I’m sure we’ll get models in the future which will facilitate this but for now LLMs don’t really stay on task like a professional human would.

And even in AI Dungeon the AI plays so fast and loose that it breaks immersion. Like if I’m doing a space trading roleplay, it doesn’t consider things like making sure the product I’m buying or selling meets a specific spec, and often a vendor will start offering to buy Product X from me while I’m negotiating purchasing Product X from them. This "type" of continuity problem happens constantly in AI Dungeon.

We’re just not there yet, but I have confidence we’ll get there. I think it’s possible even with our current model/training paradigms but we aren’t using RLHF for game applications yet.

I totally think we'll get there, I just don't think we're there yet.

I really think the next step is a heavily AI-integrated version of D&D where the DM can serve as a "filter" for some of the more unhinged output (where appropriate; an intentionally incoherent goblin with some text-to-speech could be phenomenal).

I think that's about where we're at, and I'm expecting a wave of "AI-enhanced" D&D apps any day now. They probably already exist and I just haven't seen them. I would imagine there are still occasional issues with the AI utterly choking; I see it every once in a while on some of my more "fantasy" prompts where I get too specific and it just ignores what I asked.

> I think any NPC with dialogue important to a goal (a quest, a tutorial, etc) is going to be hard to use generative AI for. It not only needs to be coherent with the story, but it needs to correctly include certain ideas. I.e. if the NPC gives a quest to go find some item at some location, it needs to say what the item is and where it is.

That was my experience when I was experimenting with using current LLMs to generate quests. You can of course ask for both a human-readable quest description and also a JSON object (according to some schema) describing the important quest elements, but the failure rate of the results was too high. Maybe 10% of quests would have some important mismatch between the description and the JSON; the description would mention an important object but it would be left out of the JSON, or the JSON would mention an important NPC but the description wouldn't, etc.

As a player, I think it would get frustrating quickly if 10% of quests were unsolvable, especially since, as a player, you don't know when a quest is unsolvable; maybe you just haven't found the item/NPC yet.
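One partial mitigation for the mismatch described above is to validate each generated quest before it reaches the player and regenerate on failure. The sketch below is only an illustration: the schema keys (`items`, `npcs`, `locations`) and the `check_quest` helper are hypothetical, and a plain substring match only catches one direction of the problem (something named in the JSON but absent from the description). Catching the reverse direction would need some form of entity extraction from the prose.

```python
import json
from dataclasses import dataclass


@dataclass
class QuestCheck:
    ok: bool
    missing_from_description: list[str]


def check_quest(description: str, quest_json: str) -> QuestCheck:
    """Verify every entity named in the JSON also appears in the description."""
    quest = json.loads(quest_json)
    # Hypothetical schema: each key maps to a list of names.
    names = quest.get("items", []) + quest.get("npcs", []) + quest.get("locations", [])
    text = description.lower()
    # Case-insensitive substring match; a real check would be fuzzier.
    missing = [name for name in names if name.lower() not in text]
    return QuestCheck(ok=not missing, missing_from_description=missing)


description = "Bring the Silver Locket to Mira at the Old Mill."
quest_json = json.dumps({
    "items": ["Silver Locket"],
    "npcs": ["Mira", "Captain Holt"],  # Captain Holt is never mentioned in the text
    "locations": ["Old Mill"],
})

result = check_quest(description, quest_json)
print(result)
```

Here the check fails because "Captain Holt" appears only in the JSON, so the quest would be rejected and regenerated rather than shipped to the player as possibly unsolvable.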

Yeah, 10% about jibes with what I would expect under the assumption that the generated text needs to be non-deterministic (i.e. no careful prompt tuning and turning the temperature down to basically 0).

An interesting flip side I was just thinking about is the AI saying too much. NPCs keeping secrets until the player gets enough reputation or does a favor or whatever is pretty common. I wonder how good they are at keeping those secrets.

Prompt injection is one thing, and vaguely equivalent to cheat codes, which is fine, but what is the likelihood that a player just asking for more info ends with the AI spitting out the secret without the player completing the quest? And will the AI know to unlock the next area or whatever, since there's then no reason for the player to do that NPC's quest?

Should be neat stuff, I'm looking forward to how this all works together when the kinks get ironed out.
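On the secret-keeping question, one option is to never give the model a secret until the game state says the player has earned it, since a model cannot leak text that was never in its prompt. This is a minimal sketch under that assumption; `NPC`, `Secret`, `build_context`, and the reputation threshold are hypothetical names for illustration, not any existing game's API.

```python
from dataclasses import dataclass, field


@dataclass
class Secret:
    text: str
    required_reputation: int


@dataclass
class NPC:
    name: str
    public_lore: list[str]
    secrets: list[Secret] = field(default_factory=list)


def build_context(npc: NPC, player_reputation: int) -> str:
    """Assemble the prompt text; locked secrets are never included."""
    lines = [f"You are {npc.name}."] + npc.public_lore
    # Only secrets the player has already earned are added to the prompt.
    lines += [s.text for s in npc.secrets if player_reputation >= s.required_reputation]
    return "\n".join(lines)


innkeeper = NPC(
    name="Innkeeper Brann",
    public_lore=["You run the Gilded Stag inn."],
    secrets=[Secret("The cellar door hides a tunnel to the keep.", required_reputation=10)],
)

stranger_prompt = build_context(innkeeper, player_reputation=0)
friend_prompt = build_context(innkeeper, player_reputation=10)
print(stranger_prompt)
print("---")
print(friend_prompt)
```

With this split, unlocking the next area would also stay in ordinary game code keyed on the same state, so the model's wording never decides progression.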

To some degree, yes. But there's a low value-to-cost ratio in that exact UX.

Take a single character in the game, and give that character the depth and nuance of a true experience between a Zen Master / Inquiry facilitator, powered by AI. IXCoach.com can do a phenomenal job powering this, so literally the only code needed for an MVP is the mod + character API.

Then, the cost-benefit ratio is 400x, and in a day of coding you have taken a game that is mostly pure entertainment and provided a means for depth, nuance and personal development that literally leads the market.

I pinged the executive producer of CD Projekt Red on this; it's viable.

https://www.linkedin.com/in/danhernberg/

Current games which use LLMs only activate the model when the user is talking to an NPC, but in order to create a truly dynamic story that is completely random yet to the point, the agents need to interact with each other as well. So let's say there are around 100 agents in the game; they need to interact with each other to generate some emergent behavior. The form of interaction is an open question here: will it be in natural language, or just some embeddings or states?

But this thing still has a long way to go.
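As a toy illustration of the "states rather than natural language" option raised above, agents can exchange structured facts directly, with no model call per interaction. Everything here is a hypothetical sketch: the `Agent` class, the `step` function, and the rumor are made up, and the fixed seed is only there to make runs repeatable.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    known_facts: set[str] = field(default_factory=set)


def step(agents: list[Agent], rng: random.Random) -> None:
    """One tick: each agent meets one random other agent and they pool their facts."""
    for agent in agents:
        other = rng.choice([a for a in agents if a is not agent])
        shared = agent.known_facts | other.known_facts
        agent.known_facts = set(shared)
        other.known_facts = set(shared)


RUMOR = "the bridge is out"
MAX_TICKS = 1000  # safety cap so the loop always ends

rng = random.Random(42)  # fixed seed so runs are repeatable
agents = [Agent(f"npc_{i}") for i in range(100)]
agents[0].known_facts.add(RUMOR)

ticks = 0
while ticks < MAX_TICKS and not all(RUMOR in a.known_facts for a in agents):
    step(agents, rng)
    ticks += 1

print(f"rumor reached all {len(agents)} agents after {ticks} ticks")
```

A language model would then only be needed at the edges, for example to phrase what an NPC knows when the player actually talks to it.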

I agree in the context of LLMs running locally. For API-connected games, cloud support for nuanced conversations would be a tremendous value add. Take a hit like Cyberpunk, create a mod that wires into a custom AI from ixcoach.com... we could literally integrate the most nuanced self-inquiry practices into the top games this way.

Anyone working on top games through mods who wants to explore this, let me know; Next AI Labs would be interested in supporting such efforts.

There are mods for Skyrim right now that run an NPC's dialog and lore through a small 7B model that outputs text dialog. Heck, if you wanted you could run a 2B Whisper model and get reasonably decent voice output.

It's all very exciting, if a little janky.

If we're just talking about NPCs in a video game, I bet the game studios have the resources to train a very specific LLM optimized for NPCs. Lots of training data could probably be stripped out; after all, your average quest-giver in Skyrim doesn't need to know how to implement Black-Scholes in Rust.

The problem is that you need two GPUs and the AI one can't be from AMD. We aren't 15 years away. More like two or three. NPUs are coming, and DDR6 plus quad-channel memory would get you decent performance on small LLMs like Llama 3.

You're also forgetting that batch performance is already an order of magnitude better than single session inference.

I agree for the most part, but I still think some pretty cool games can come out of local LLMs. Suck Up, for example, though not local afaik, is a pretty cool one.

There are a few games that use LLMs and voice; they are usually hilariously janky.

Could you name some?

How in the world would this be tested? Anything pertaining to game logic needs to be deterministic.

I can't see LLMs in games being used for anything more than some random NPC voice quips. And whose voice would be used? Would voice actors be okay with this?

There are already too many bad games, we certainly don't need thousands more with AI-generated drivel dialogue, although having human writers is not a panacea either way.

Have other AI agents test the game in thousands of scenarios. Voice actors are not needed; SOTA TTS systems can synthesize a brand new voice from a description.