Hacker News new | ask | show | jobs
by yorwba 4 days ago
The intrinsic limitation of text diffusion is that natural text contains serial dependencies where a word at the beginning of the text strongly influences what comes later, and if there is a long enough dependency chain within a diffusion block, the small number of diffusion steps may not be enough to resolve all dependencies, so that you end up with incoherent output.
1 comments

The obvious solution is to simply do more steps for larger sequences though, right?

How exactly does this work with CoT?