

	by danielbln 1180 days ago
	Generative image models don't use transformers; they're diffusion models. LLMs are transformers.

2 comments

Diffusion models can use a transformer architecture; DiT (Diffusion Transformer) is one example. Stable Diffusion uses a U-Net architecture with transformer blocks.

Ah yes, that's right. And as I understand it, they technically also use a transformer for the CLIP text encoder.
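To make the "U-Net with transformer blocks" point concrete, here is a rough NumPy sketch of a single cross-attention step of the kind such a block might contain: the convolutional feature map is flattened into tokens, which then attend to text-encoder outputs. This is an illustration under simplifying assumptions, not Stable Diffusion's actual code: the function name `cross_attention`, the weight names, and all shapes are hypothetical, and it uses a single head with no layer norm or feed-forward sublayer.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(feature_map, text_tokens, w_q, w_k, w_v, w_o):
    """Hypothetical single-head cross-attention over a conv feature map.

    feature_map: (C, H, W) activations from a U-Net stage.
    text_tokens: (T, D_text) outputs of a text encoder.
    """
    c, h, w = feature_map.shape
    # Flatten the spatial grid into H*W "image tokens" of width C.
    x = feature_map.reshape(c, h * w).T               # (H*W, C)
    q = x @ w_q                                       # queries from image tokens
    k = text_tokens @ w_k                             # keys from text tokens
    v = text_tokens @ w_v                             # values from text tokens
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))    # (H*W, T)
    out = x + (attn @ v) @ w_o                        # residual connection
    # Restore the (C, H, W) layout so convolutions can follow.
    return out.T.reshape(c, h, w), attn

# Toy shapes, chosen only for illustration.
rng = np.random.default_rng(0)
C, H, W, T, D_TEXT, D_ATTN = 8, 4, 4, 5, 16, 8
feature_map = rng.normal(size=(C, H, W))
text_tokens = rng.normal(size=(T, D_TEXT))
w_q = rng.normal(size=(C, D_ATTN))
w_k = rng.normal(size=(D_TEXT, D_ATTN))
w_v = rng.normal(size=(D_TEXT, D_ATTN))
w_o = rng.normal(size=(D_ATTN, C))

out, attn = cross_attention(feature_map, text_tokens, w_q, w_k, w_v, w_o)
print(out.shape, attn.shape)  # (8, 4, 4) (16, 5)
```

The output keeps the feature map's shape, which is why attention blocks like this can sit between convolutional layers; a DiT-style model would instead drop the convolutions and operate on patch tokens throughout.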