
by Alex-Programs 483 days ago

This is a crazy paper. A first-generation diffusion model is beating Llama 3 in some areas, even though Llama 3 has had a huge amount of tuning and improvement work put into it. And it's from China again!

A whole new "tree" of development has opened up. With so many possibilities - traditional scaling laws, out-loud chain of thought, in-model layer-repeating chain of thought, and now diffusion models - it seems unlikely to me that LLMs are going to hit a wall that the river of technological progress cannot flow around.

I wonder how well they'll work at translation. The paper indicates that they're rather good at poetry.

Interesting times.