HN Mirror



	by sadpasture 1252 days ago
	I think it has to do with text being much more precise. Your stably diffused cartoon avatar having six fingers is not nearly as noticeable as a language model's chat misspelling every second word. So you need fewer resources to get to a human-acceptable result.

1 comment

no, diffusion models are just more efficient