by ironbound 152 days ago
Other teams did a better job and provided code

https://github.com/kuleshov-group/bd3lms

This is unrelated. Both use the word "block", but they refer to different things.
"Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models"
Yes, and? The paper I linked is about network weights, not the type of generative model.