Hacker News
by derac, 768 days ago
It's one model with text/audio/image input and output.
1 comment
jacobsimon, 768 days ago
Very exciting. I would love to read more about how the image generation architecture works. Is it still a diffusion model that has somehow been integrated with a transformer, or an entirely new architecture that is not diffusion-based?