| HN Mirror


	by pistachiopro 1552 days ago
	This is indeed rendered in realtime, but one thing to note is it's a "4D" capture, more-or-less meaning each frame of the animation is its own asset. This makes it possible to reproduce subtle physics like the lips sticking together slightly when the actor opens her mouth. The amount of storage space, alone, makes this impractical for anything other than demos. Unity claims they will be able to achieve this level of fidelity using a deep learning-based compression that will allow stuff like this to appear in game cutscenes, but all the movements will still be pre-baked. The only interaction possible will be moving the camera. At that point the technology will be very useful, but it's still a ways away from having such a realistic character that can react to you dynamically. (Though whether that's just a couple years of software technology progress, or a decade+ for hardware progress, who can say?)
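
To get a feel for why per-frame assets add up so quickly, here is a rough back-of-the-envelope sketch. The vertex count, frame rate, and clip length are illustrative assumptions, not figures from the Unity demo, and the estimate covers vertex positions only (no normals, textures, or wrinkle maps).

```python
# Rough size of an uncompressed per-frame ("4D") mesh sequence.
# Every number passed in below is a hypothetical value chosen for illustration.

def raw_sequence_bytes(vertices, fps, seconds, floats_per_vertex=3, bytes_per_float=4):
    """Bytes needed to store one position per vertex for every frame."""
    frames = fps * seconds
    return vertices * floats_per_vertex * bytes_per_float * frames

# Assumed: a 100,000-vertex head, captured at 30 fps, for one minute.
size = raw_sequence_bytes(vertices=100_000, fps=30, seconds=60)
print(f"{size / 1e9:.2f} GB for one minute of positions only")
```

Under these assumptions a single minute of animation already costs a couple of gigabytes before any other per-frame data is counted.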

6 comments

dylan604 1552 days ago

>achieve this level of fidelity using a deep learning-based compression

What does that mean? To me, they might just as well have said middle-out compression.

greysphere 1552 days ago

I'd guess a 4d version of: https://paperswithcode.com/method/nerf

teddykoker 1550 days ago

This might be what you’re looking for: https://gafniguy.github.io/4D-Facial-Avatars/

lagrange77 1552 days ago

I guess this means dimensionality reduction for example with the use of a convolutional autoencoder.
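
As a rough illustration of the dimensionality-reduction idea, the sketch below swaps the convolutional autoencoder for its simplest linear relative: a single principal component found by power iteration. This is not Unity's or Ziva's method. The data is synthetic and deliberately built so that one number per frame is enough; real captures would need many more components or a nonlinear model. All function names are hypothetical.

```python
import math
import random

def make_frames(num_frames=40, num_coords=30, seed=0):
    """Synthetic animation: each frame is a rest pose plus weight(t) * one motion direction."""
    rng = random.Random(seed)
    rest = [rng.uniform(-1, 1) for _ in range(num_coords)]
    direction = [rng.uniform(-1, 1) for _ in range(num_coords)]
    return [
        [r + math.sin(0.3 * t) * d for r, d in zip(rest, direction)]
        for t in range(num_frames)
    ]

def fit_one_component(frames, iterations=50):
    """Return the mean pose and the dominant motion direction (power iteration)."""
    n = len(frames[0])
    mean = [sum(f[i] for f in frames) / len(frames) for i in range(n)]
    centered = [[x - m for x, m in zip(f, mean)] for f in frames]
    basis = [1.0] * n
    for _ in range(iterations):
        # Apply the (unnormalized) covariance matrix without building it:
        # C v = sum over frames of (frame . v) * frame
        scores = [sum(a * b for a, b in zip(f, basis)) for f in centered]
        new = [sum(s * f[i] for s, f in zip(scores, centered)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        basis = [x / norm for x in new]
    return mean, basis

def encode(frame, mean, basis):
    """Compress a frame to a single float: its projection onto the basis."""
    return sum((x - m) * b for x, m, b in zip(frame, mean, basis))

def decode(code, mean, basis):
    """Rebuild an approximate frame from the single stored float."""
    return [m + code * b for m, b in zip(mean, basis)]

frames = make_frames()
mean, basis = fit_one_component(frames)
codes = [encode(f, mean, basis) for f in frames]
max_error = max(
    abs(x - y)
    for f, c in zip(frames, codes)
    for x, y in zip(f, decode(c, mean, basis))
)
print(f"{len(frames[0])} floats per frame -> 1 float per frame, max error {max_error:.2e}")
```

The stored data shrinks from 30 floats per frame to one float per frame plus a shared mean and basis. An autoencoder plays the same encode/decode role but can capture nonlinear structure.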

pistachiopro 1552 days ago

Unity recently acquired Ziva, which specializes in the detailed animation of humans and other animals. They were known for their (not realtime) physics-based solutions, but now they have an ML model for faces, apparently. As far as I know, it's still in beta and not widely available. Unity says they will re-release this demo with the Ziva face in a matter of weeks and the quality will be even higher. And possibly allowing interactivity as well?? I guess we'll see in a few weeks.

motoboi 1552 days ago

Superresolution. You render a lower-resolution animation (fewer pixels = fewer calculations) and then use superresolution to turn that into a 4K image. This is already a reality on NVIDIA GPUs (I think it's called DLSS).

cma 1552 days ago

They are talking about compressed geometry, not pixels. This is more similar to Alembic and other geometry streaming tech: https://en.wikipedia.org/wiki/Alembic_(computer_graphics)

There is one out there from 5 years ago or so that is similar to Google's Seurat but for animated stuff; I think it pre-bakes triangle culling for different views within a limited volume. I can't remember its name. From the details I do remember (there was a realistic orangutan or something like that rendered with fur), I should be able to find it on Google, but Google search has degraded recently.

pistachiopro 1552 days ago

Nvidia DLSS is an important part of how they achieved 30Hz at 4k resolution, but that's more of a shading assist and doesn't affect the animation. The facial animation will be compressed with Ziva's ML solution.

subpixel 1552 days ago

aka "enhance": https://www.youtube.com/watch?v=Vxq9yj2pVWk

sebzim4500 1552 days ago

If it's only for cutscenes why not just have a video?

Jare 1552 days ago

Cutscenes work a lot better (more immersive) if they can correctly reflect runtime-defined assets, e.g. your own character with your customizations, gear and clothes, etc, or the dynamic state of the environment in which gameplay was happening: destruction debris, current time of day, and such.

TylerE 1552 days ago

Plus cutscenes get a lot bigger when you’re doing 4k60 and not 1080p30
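
The raw arithmetic behind that jump can be sketched as below, assuming "4k" means 3840x2160 and "1080p" means 1920x1080. This counts uncompressed pixels per second only; encoded file sizes depend on the codec and bitrate and do not have to scale by the same factor.

```python
def pixels_per_second(width, height, fps):
    """Raw pixel throughput of a video stream, before any compression."""
    return width * height * fps

uhd_60 = pixels_per_second(3840, 2160, 60)  # assumed meaning of "4k60"
fhd_30 = pixels_per_second(1920, 1080, 30)  # assumed meaning of "1080p30"
ratio = uhd_60 / fhd_30
print(ratio)  # 8.0: four times the pixels per frame, twice the frames
```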

softfalcon 1552 days ago

Because they want to push the limits and make their engine look amazing. Also, if they research hard enough, in-game rendering becomes so close to video that you can't tell the difference.

One baby step at a time.

dylan604 1552 days ago

Because actors are expensive. Reshoots are even more of a cost if things change.

With this, you just have the character do exactly what you want, when you want, without needing to talk to an agent.

mkl 1551 days ago

Video doesn't mean with a camera, just pre-rendered.

alternatetwo 1550 days ago

Because video cutscenes always look crap 5 years later. Easily distinguishable from in-game rendered.

Of course most games don't care about 5 years later, but it still looks crap.

andybak 1552 days ago

This is a fair point.

NHQ 1552 days ago

Movement won't be pre-baked; a physics engine sim will be baked into the neural network, and movements will be another dimension for the deep learning network. And then all of that will be baked into an agent that has been trained to carry out motives (with a simulation of your character, etc). The same applies to speech as to movement. And the deep-learned compression rate will be magnificent.

CyberDildonics 1552 days ago

What led you to this bold prediction of the future?

NHQ 1547 days ago

No predictions, just an explainer of how AI agents are trained. For instance, RL is about presenting an environment via rules (gravity, etc), and letting the agent learn its way around, thus discovering what it can and cannot do (a policy for the environment).
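
For readers unfamiliar with the "environment rules, agent learns a policy" loop described here, a toy sketch follows. It uses tabular Q-learning on a five-cell corridor. Both the algorithm and the environment are choices made for this illustration (the comment names neither), and none of it relates to facial animation.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right

def step(state, action):
    """Environment rules: walls clamp movement, reaching the goal pays 1."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values purely from trial and error in the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)  # explore with uniformly random actions
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward now plus discounted best value later
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q

q_table = train()
# Greedy policy for every non-goal state: pick the action with the higher value.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(policy)
```

The agent is never told that "right" is correct. It only sees the rules through `step` and ends up preferring the moves that lead to the reward.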

CyberDildonics 1546 days ago

You didn't explain how anything actually works; you gave a very crude prediction, with a lot of holes, of how you think something will work in the future.

kingcharles 1552 days ago

[citation required]

Melatonic 1551 days ago

This type of tech is already heavily used by the film industry, though. Dynamic reaction is not much of a concern there at all.

zokier 1552 days ago

They also say

> Tension tech for blood flow simulation and wrinkle maps, eliminating the need for a facial rig for fine details

Which sounds like it is not 100% prebaked animation?
