by slver 1829 days ago
I wouldn't call it scamming, but 173MB is not small at all. At the resolution of this model, you can easily fit the entire Titanic movie in 173MB. Maybe even have enough space for audio.
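As a rough sanity check on the size comparison, here is a back-of-the-envelope sketch of the average bitrate a 173MB budget would allow. The 194-minute runtime and the use of decimal megabytes are assumptions for illustration; neither is stated in the thread.

```python
# Back-of-the-envelope: what average bitrate does a 173 MB budget allow?
MODEL_SIZE_MB = 173      # size quoted in the comment above
RUNTIME_MINUTES = 194    # assumption: the film runs about 3h14m


def average_bitrate_kbps(size_mb, minutes):
    """Average bitrate in kbit/s if size_mb (decimal megabytes) covers `minutes`."""
    total_bits = size_mb * 1_000_000 * 8
    return total_bits / (minutes * 60) / 1000


kbps = average_bitrate_kbps(MODEL_SIZE_MB, RUNTIME_MINUTES)
print(f"about {kbps:.0f} kbit/s for video and audio combined")
```

Under those assumptions the budget works out to roughly 119 kbit/s. Whether that is enough for watchable video depends on the resolution and codec, which aren't specified here.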

Furthermore no one is saying the model "memorized every possible combo". However imagine you have a set of keyframes (maybe even multiple fragments per frame) and you need to interpolate between them? Not that hard of a task, isn't it.

Models don't care about simulating our "intention" properly. They care about fitting the input in the simplest way possible. Think about a model like a lazy worker merely trying to look like it's working.

None of this makes NNs less exciting, but it should tell us that you can't go from 0 to 60 in one step and hope the NN has great insight into what it's doing.

We need models that make smaller conceptual jumps, i.e. models that understand 3D space, then models that understand transformations in 3D space, then models that understand cityscapes, etc.

3 comments

It sounds like you and others are trying to clarify how this demo doesn't live up to your idealized, subjective expectations. No one is claiming this is a revolutionary or even useful video game engine.

It's a neural network that recreates a limited, yet fully dynamic gameplay segment only based on player input. It's a really neat and fun project.

I think it's quite telling that you accuse me of having idealized, subjective expectations and then describe the demo as "limited yet fully dynamic gameplay". It rotates the car to the left or right depending on whether you press left or right.

It's super-interesting but it doesn't recreate limited fully dynamic gameplay. It doesn't recreate any sort of dynamic gameplay. That's your idealized, subjective interpretation.

The driving seems pretty dynamic to me. Maybe "fully" was a bit hyperbolic, as I can't really justify or quantify what that would entail. On the other hand, saying that it's not dynamic at all seems equally misguided. Also, you seem to disregard the "limited" and "segment" qualifiers, which were there for a reason.

> However imagine you have a set of keyframes (maybe even multiple fragments per frame) and you need to interpolate between them? Not that hard of a task, isn't it.

Interestingly, the video artifacts of this model look somewhat similar to those from simple motion interpolation algorithms such as ffmpeg's minterpolate, especially during fast camera motion. https://ffmpeg.org/ffmpeg-filters.html#minterpolate

Edit: I generated an example with strong artifacts. Input: https://mscharrer.net/tmp/lowfps.webm Output: https://mscharrer.net/tmp/minterpolate.webm
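For anyone who wants to try something similar, here is a minimal sketch of how such an output could be produced. The exact settings used for the files above aren't given, so the target frame rate of 60 and the `mi_mode=mci` choice are assumptions, the local file names are placeholders, and `minterpolate_cmd` is a hypothetical helper name.

```python
import shlex


def minterpolate_cmd(src, dst, fps=60):
    """Build an ffmpeg command that motion-interpolates `src` up to `fps`."""
    # mi_mode=mci asks the minterpolate filter for motion-compensated
    # interpolation rather than frame duplication or blending.
    video_filter = f"minterpolate=fps={fps}:mi_mode=mci"
    return ["ffmpeg", "-y", "-i", src, "-vf", video_filter, dst]


# Placeholder file names; print the command instead of running it.
print(shlex.join(minterpolate_cmd("lowfps.webm", "minterpolate.webm")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with ffmpeg installed. The lower the input frame rate and the faster the camera motion, the more visible the warping artifacts tend to be.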

Memorizing a static succession of frames with nothing actually being dynamic and interactive isn't the same challenge as this.