Hacker News
by coolKid721 301 days ago
I do not get the point of this at all. Why not just generate game assets and run them in an engine? With this format there would be no guarantee that the thing you saw before will look the same (and that is not a fixable problem).

Actually figuring out and improving AI approaches for generating consistent, decent-quality game assets is something that will be useful. This, I have no idea what the point is past a tech demo (and for some reason all the "ai game" people take this approach).

8 comments

> I do not get the point of this at all

Dunno, this seems like an avenue definitely worth exploring.

Plenty of game applications today already have a render path of input -> pass through AI model -> final image. That's what the AI-based scaling and frame interpolation features like DLSS and FSR are.

In those cases, you have a very high-fidelity input doing most of the heavy lifting, and the AI pass filling in gaps.

Experiments like the OP's are about moving that boundary and "prompting" with a lower-fidelity input and having the model do more.

Depending on how well you can tune and steer the model, and where you place the boundary line, this might well be a compelling and efficient compute path for some applications, especially as HW acceleration for model workloads improves.

No doubt we will see games do variations of this theme, just like games have thoroughly explored other technology to generate assets from lower-fidelity seeds, e.g. classical proc gen. This is all super in the wheelhouse of game development.

Some kind of AI-first demoscene would be a pretty cool thing too. What's a trained model if not another fancy compressor?

Compared to just coming up with and having solid systems for generating game assets? Actually having decent-quality, style-consistent 3D models, texture work, animation, sound effects, etc. (especially if it were built into, say, a game engine) would actually revolutionize indie game dev. Games are fundamentally artistic works, so yes, anything decent will actually require tailoring and crafting. AI set up to serve those people actually makes sense, is totally technically feasible, and is a way easier problem to solve.

And no if you heavily visually modify something with AI models to the extent it significantly alters the appearance it simply has no way of being consistent unless you include the previously generated thing somehow which then has the huge problem of how do you maintain that over an 80 hour game? How do you inform the AI what visual elements (say text, interactive elements) are significant and can be modified and which can't? (You can't)

Actually using AI to generate assets, having a person go in to make sure they look good together and make sure they match, then just saving them as textures so they function like normal game assets makes 10000x more sense than generating a user image then trying to extract "hey what did the wrapper of this candy bar look like" from one AI-generated image and figuring out how to make sure that is consistent across that type of candy bar in the world and maintains that consistency throughout the entire game, instead of just, you know, generating a texture of a candy bar?

I think you're making a lot of good points for the current SOTA.

That being said, it took us a few hundred years all in all just to work out paint, so if people keep working with this tech eventually a game designer could, in theory, lay out the skeleton of a game and tell an AI to do the rest, give it pointers and occasional original art to work into the system, and ship a completed playable game in days.

Whether it will be worth playing those games is an entirely different enchilada to microwave.

> lay out the skeleton of a game and tell an AI to do the rest, give it pointers and occasional original art to work into the system, and ship a completed playable game in days

"But, think of the indie game designer!" is getting to be quite the take.

We have a machine that produces slop and the selling point is how fast it produces it? And how more people should be using it to spend less time on creative aspects? Would the world be a better place if GRRM "finished" his most well-known work sooner rather than never?

Something about the phrase "tell an AI to do the rest, give it pointers" reminds me of "The Sorcerer's Apprentice" from Fantasia. Not in the surface level dire-warning about laziness, automation, and losing control that story is telling but in that Mickey didn't spend any time thinking about what he was doing and the Wizard's disappointment at the end.

This whole attitude is just the attitude of tech demos. Nothing good or worthwhile will be a "completed playable game in a couple days". Actually making anything good takes a huge amount of time and effort and thought. Empowering a small indie studio or solo indie dev so they could make something AA or AAA quality should be the actual goal. If you have 4 people able to make a Skyrim-level game in a year or two, that's an insane feat; that should be the goal. Not someone who doesn't give a shit throwing in some prompt and making some slop game that is exactly like 500 other slop games people generate with one prompt.

Like, with that tech, what kind of games would random solo developers plugging away at it and refining it be able to make in 4 years? That is the extremely compelling stuff. One person being able to make some auteur AAA-quality game on their own, even if it takes a long time, might actually be good. If there are AI games, those are the ones I'd want to play.

*Edit: misread the post I replied to, disregard comment contents*

> Plenty of game applications today already have a render path of input -> pass through AI model -> final image.

Where "plenty" equals zero?

> In those cases, you have a very high-fidelity input doing most of the heavy lifting, and the AI pass filling in gaps.

That is, in these cases you already have high-fidelity input in the form of an actual game, and "AI" contributions to the output are dubious at best.

Do you really believe it's DLSS that's doing all the heavy lifting in a game like Expedition 33 or Cyberpunk 2077?

Here’s a recent demo made by researchers at Nvidia trying to render the most detailed, realistic scene possible using AI tech to a small degree, mostly as asset compression, but also in material modeling beyond what direct approaches can currently handle in real time: https://m.youtube.com/watch?v=0_eGq38V1hk

Here’s a video from a rando on Reddit conveniently posted today after playing around for an afternoon: https://www.reddit.com/r/aivideo/comments/1n1piz4/enjoy_nano...

The Nvidia team carefully selected a scene to be feasible for their tech. The rando happened to select a similar-ish scene that is largely well suited for game tech.

Which scene looks more realistic? The rando’s by far. And adding stylization is something AIs are very well established as being excellent at.

Yes, there are still lots of issues to overcome. But, I find the negativity in so many comments in here to be highly imbalanced.

that rando's one looks like fucking garbage like a bunch of shitty B roll footage from an AI generated advertisement for some kind of pharmaceutical, the space and layout of the world clearly is constantly shifting and makes no sense. just generate a fucking map we already know how to do this, tracking geometry is an EASY problem for computers, don't force some AI shit to try to do it for some reason.
There's something so unsettling about that 2nd link you posted, and I don't mean from an "AI is impressive" POV (I think it looks like absolute garbage but will probably continue improving bit by bit). There's something even in the still frames which scratches my brain in a very unpleasant way; it's hard to describe the sensation. The closest thing is "disgust"; maybe it's an uncanny valley type of effect.

Also, how are these 2 comparable at all? Obviously the video looks more "realistic", the first one is obviously a game demo of some kind and is stylized, whereas the latter looks like a terrible travel agency ad.

> how are these 2 comparable at all. the latter looks like a terrible travel agency ad.

Don't focus on the emotional context of the scenes. Look at the physical content. They both contain stonework, plants and a human. They are both large, detailed scenes lit by strong sunlight and a bright sky. As far as rendering technology requirements go, they are very similar.

> scratches my brain in a very unpleasant way, it's hard to describe the sensation. The closest thing is "disgust"

There is a strong discontinuous motion effect because the video tech is based on a sequence of "first frame, last frame" inputs spliced together. There are a couple of seconds where her behavior gets uncanny valley, particularly her hands on the door. That wouldn't be a concern in an unrealistic video game, but it would be in a this-realistic game regardless of the tech. And there is a very slight warble in the fine details. But you have to really look for those to distinguish them from MPEG artifacts.

I expect what a lot of people here are feeling when they watch the video is disgust based 90% or more just on the fact that the video was AI generated :/

hack, learn and have fun, that's it.
What are we learning here, and are we still bound by ethics?
The tech will improve to far exceed the capabilities of a game engine. Real time improvisation and infinite choices, scope, etc.

It makes no sense when people say AI can't do this or that. It will do it next week.

I’m looking forward to the day when magical thinking such as this gets grounded again. That is when the real work will start anew.
Having spent 20 years making game engines and the past 4 years playing with AI image gen, I believe you are right.

There have been musings for a while now that 3D rendering is going to switch from “lay down the scene’s albedo & specular parameters then do the lighting pass” to “lay down the scene’s latent parameters and then do the diffusion pass”.

Recently, the advances in “real time AI world models” have been coming ridiculously fast.

Put these together and it’s no stretch at all imagining a game built by having artists go nuts doing whatever they want with whatever Maya can handle as long as they also make proxy geometry of trivial complexity that can be conceptually associated with the final renders. Train the AI on the association. Render the proxy geometry the old fashioned way. AI that up in real time to the associated Maya-final-render approximation.

It’s not going to happen this week. But, in 5 years? Somebody’s gonna pull it off.

>It’s not going to happen this week. But, in 5 years? Somebody’s gonna pull it off.

In a tech demo, sure. In a real product? I doubt it. We have neat tech demos from 20 years ago that amounted to not much, because implementing them in a real business product was unviable, unwieldy, or didn't meet standards.

There's a lot to get away with with demos. Not so much in a product you want to sell. Much less so for a deluxe product with a bunch of competition like games.

"It makes no sense when people say AI can't do this or that. It will do it next week."

So full self driving vehicles will be finally ready next week then? Great to hear, though to be honest, I remain sceptical.

Waymos are driving themselves around several cities right now.
In full self driving mode, meaning with no human overlooking and correcting?
Yes, go to Rome. I will be impressed when we have self driving cars in Rome.
They are in San Francisco today so it's not like they are doing this on easy mode.
But are they doing it without human intervention?
Wait til it snows.
The discussion around self-driving cars often feels like shifting goalposts: each time one feature is achieved, a new requirement is added, perpetually delaying the "final" answer.

Self-driving cars are "here"... until someone adds another requirement.

I mean by your logic self driving cars were invented back when we put a steam engine on some tracks in the 1800s. Of course the goalposts shift when the hypesters are trying to sell you on an idea like "AI will be able to do literally everything next week".

Yes, Waymo can today drive around extremely dense car-friendly cities that are scanned and mapped in great detail weekly... They also still have to have remote human intervention all the time, and are freaked out by traffic cones being placed on the hood. I grew up in Indonesia and that's where I learned to drive, and trust me, if Waymo is ever able to navigate 100 meters on any road in Jakarta I'll happily concede and consider self-driving to be a solved problem.

For me it's always been cars without steering wheels built on a factory line.
Operating in the snow is not a niche requirement.
You really need to update your language model because self driving cars have been driving around on their own for at least a year now
So have they stopped having the >1 average remote drivers for each self driving vehicle as well?

The problem with these statements is language has so much context implicit in it. "driving around on their own" to me means with zero active oversight. "driving around" to me means not just in a small set of city streets, but as a replacement for human driving (eg anywhere a vehicle can physically fit). Obviously to you it means other things, but it's what makes these conversations and statements of fact challenging.

That >1 spec is from Cruise, who went defunct in 2023.

Teslas have >1, but they are not really self-driving, more "100% human supervised self-driving."

"Full self driving" was the term used and I believe the distinction is relevant to the point being made.
I understand the point you're making, but I think it's not a good one.

The failure mode for getting a self-driving car wrong is grave. The failure mode for rendering game graphics imperfectly is to require a bit of suspension of disbelief (it's not a linear spectrum given the famous uncanny valley, etc., I'm aware). Games already have plenty of abstract graphics, invisible walls, and other kludges that require buy-in from users. It's a lot easier to scale that wall.

Not my point, but I agree with it so.

The statement was one of capability. There are some things that the tech is flatly not capable of, and that it will take time to develop the capability of. Even if there were no safety concerns at all and we lived in a cotton candy bubble world, self driving cars still have hard failure modes. The tech is not capable, and will not develop the capability next week, either.

The point being made is that the tech is moving fast, at least according to the marketing, but a revolution is not happening every week. "This is the worst it'll ever be" is an increasingly tired refrain when things seem to be stagnating more than ever. The mentioned behavior will take a good amount of time; it's silly to wait around for it when it is not unlikely it may never come.

I'm a game engine programmer and part of my job is to figure out how to "do it next week". I don't see it happening with the way we use LLMs right now. We don't have time to push data to a cloud and come back and render a scene; we have 16ms per frame on most projects (maybe 33 if you are doing something cinematic).

You're gonna have to break the light barrier before thinking about how to do real time AI rendering at this rate.
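
For anyone wondering where the 16ms and 33ms figures come from, they are just 1000 ms divided by the target frame rate (60 and 30 fps). A rough sketch of that arithmetic follows; the function names and the 50 ms round trip are made up for illustration, not measurements from any real service.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def fits_in_frame(fps: float, round_trip_ms: float, local_work_ms: float = 0.0) -> bool:
    """True if a network round trip plus local work fits inside one frame."""
    return round_trip_ms + local_work_ms <= frame_budget_ms(fps)


# Hypothetical cloud round trip, chosen only to illustrate the comparison.
ASSUMED_ROUND_TRIP_MS = 50.0

for fps in (60, 30):
    budget = frame_budget_ms(fps)
    ok = fits_in_frame(fps, ASSUMED_ROUND_TRIP_MS)
    print(f"{fps} fps: budget {budget:.1f} ms, 50 ms round trip fits: {ok}")
```

Under that assumed latency the round trip alone blows the budget at both frame rates, before any model inference or rendering time is counted.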

Why would I want any of that? Games are interesting because of the deliberate choices and limitations made by a developer. When you have a game that tries to do everything, you have a game that actually does nothing.
An interactive feedback loop that handles various edge cases of AI, rendering it, asset loading and display, keeping track of global data, user input, etc. -- is still a game engine.
> (and that is not a fixable problem)

Genie 3 is incredible (relative to previous models) in this regard. Not a solve, but it is doing what it is not supposed to be able to do.[1]

[1]https://deepmind.google/discover/blog/genie-3-a-new-frontier...

It's an interesting tech demo—I think one interesting use case for AI rendering is changing the style on the fly. For example, a certain power-up could change the look to a hyper-saturated comic book style. Definitely achievable with traditional methods, but because AI is prompt-based, you could combine or extend styles dynamically.
Visually speaking, there are always issues in tying disparate assets together in a seamless fashion. I can see how AI could easily be used to "hide the seams", so to speak. I think a hybrid approach would definitely be an improvement.
AI can't even hide its own seams. The seams are kind of the defining characteristic of AI.
> I do not get the point of this at all

I think the point of this is just because it's cool. So, as you said, it only serves as a tech demo, but why not? Many things have no point. It's unreasonable, but it's cool.