


	by andrew-w 425 days ago
	One way this differs is in the model architecture. Our approach relies on a single pass of a diffusion transformer (DiT), whereas Live Portrait relies on intermediate representations and multiple distinct modules. Getting a DiT to be real-time was a big part of our work. Quoting the Live Portrait paper: "Diffusion-based portrait animation methods [...] are usually [too] computationally expensive." As you hinted at, we had to compromise on resolution to get there (this demo is 256x256), but we think that will improve over time.


Not relying on facial keypoints means we can animate a wide range of non-humanoid characters. My favorite is talking to the Doge meme.