by intalentive
393 days ago
This explanation is intuitive:
https://www.youtube.com/watch?v=zc5NTeJbk-k

My takeaway is that diffusion "samples all the tokens at once" and refines them incrementally, rather than getting locked into a particular path the way autoregression does, since an autoregressive model can only look backward at the tokens it has already emitted. The upside is global context; the downside is fixed-size output.
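To make the contrast concrete, here's a toy Python sketch. It's an illustration under loud assumptions, not how any real model works: `toy_scores` is a hypothetical stand-in for a model that just returns a random token and a random confidence, and the diffusion side uses iterative unmasking, which is one discrete variant I'm assuming for the example rather than something taken from the video. All names here are made up.

```python
import random

VOCAB = ["a", "b", "c", "d"]
MASK = "_"


def toy_scores(seq, pos, rng):
    """Hypothetical stand-in for a model: returns (token, confidence).

    A real model would condition on `seq` and `pos`; this toy ignores them.
    """
    return rng.choice(VOCAB), rng.random()


def autoregressive_sample(length, rng):
    """Left to right: each token sees only the prefix and is never revisited."""
    seq = []
    for _ in range(length):
        token, _ = toy_scores(seq, len(seq), rng)
        seq.append(token)
    return seq


def diffusion_style_sample(length, steps, rng):
    """Start from a fixed-size, fully masked canvas and fill it in over `steps`."""
    seq = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(seq) if t == MASK]
        if not masked:
            break
        # Propose a token for every masked position, given the whole sequence.
        proposals = {i: toy_scores(seq, i, rng) for i in masked}
        # Commit an even share of the remaining positions, most confident first.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            seq[i] = proposals[i][0]
    return seq


if __name__ == "__main__":
    rng = random.Random(0)
    print("autoregressive: ", "".join(autoregressive_sample(8, rng)))
    print("diffusion-style:", "".join(diffusion_style_sample(8, 4, rng)))
```

Note that `diffusion_style_sample` has to be told `length` before it starts, which is the fixed-size downside, while the positions get filled in whatever order the confidences dictate instead of strictly left to right.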