Hacker News
by dragonwriter 452 days ago
Autoregressive LLMs don't usually work on tree structures; they work on capped-length linear token sequences, which are isomorphic to fixed-length sequences (pad anything shorter up to the cap).
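
To make the padding point concrete, here is a minimal sketch. `PAD`, `MAX_LEN`, `to_fixed`, and `from_fixed` are names made up for illustration, not from any particular model or library, and it assumes the pad id never occurs as a real token:

```python
PAD = 0        # hypothetical pad id, assumed never to occur as a real token
MAX_LEN = 8    # hypothetical length cap


def to_fixed(tokens):
    """Right-pad a sequence of at most MAX_LEN tokens to exactly MAX_LEN."""
    if len(tokens) > MAX_LEN:
        raise ValueError("sequence exceeds the length cap")
    if PAD in tokens:
        raise ValueError("PAD must not appear as a real token")
    return list(tokens) + [PAD] * (MAX_LEN - len(tokens))


def from_fixed(fixed):
    """Invert to_fixed by dropping the trailing padding."""
    n = len(fixed)
    while n > 0 and fixed[n - 1] == PAD:
        n -= 1
    return list(fixed[:n])
```

Under those assumptions the round trip is lossless, so anything that consumes fixed-length sequences can also handle capped-length ones.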

I'm not sure why you think working on tree structures rather than fixed-length sequences would be necessary for diffusion language models, which, again, actually exist. Aside from Mercury, which is proprietary, there is also LLaDA: https://ml-gsai.github.io/LLaDA-demo/