by
sdpmas
104 days ago
diffusion is promising, but it's still an open question how data-efficient diffusion models are compared to AR. in practice, you can also train AR forever with high enough regularization, so let's see.
_0ffh
104 days ago
Yes, it could go either way of course.
Still, just for reference, here's the paper I remembered:
https://arxiv.org/pdf/2507.15857
sdpmas
104 days ago
thanks, here's another one:
https://arxiv.org/abs/2511.03276