by dragonwriter
452 days ago
Autoregressive LLMs don't usually work on tree structures; they work on capped-length linear token sequences, which are isomorphic to fixed-length sequences. I'm not sure why you think working on tree structures rather than fixed-length sequences would be necessary for diffusion language models, which, again, actually exist. Aside from Mercury, which is proprietary, there is also LLaDA: https://ml-gsai.github.io/LLaDA-demo/
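
To make the isomorphism point concrete, here is a minimal sketch of one way to map capped-length sequences to fixed-length ones and back. It assumes a padding id (`PAD`, hypothetical) that never appears as a real token and that padding only ever goes at the end. The function names are made up for illustration and do not come from any particular model's code.

```python
PAD = -1  # hypothetical padding id; assumed never to be a real token id


def to_fixed(tokens, max_len, pad=PAD):
    """Pad a sequence of at most max_len tokens out to exactly max_len."""
    if len(tokens) > max_len:
        raise ValueError("sequence exceeds the length cap")
    return list(tokens) + [pad] * (max_len - len(tokens))


def from_fixed(padded, pad=PAD):
    """Invert to_fixed by stripping the trailing padding."""
    end = len(padded)
    while end > 0 and padded[end - 1] == pad:
        end -= 1
    return list(padded[:end])


if __name__ == "__main__":
    seq = [5, 3, 9]
    fixed = to_fixed(seq, max_len=8)
    print(fixed)              # padded out to length 8
    print(from_fixed(fixed))  # original sequence recovered
```

Under those assumptions the round trip loses nothing, so a model that operates on fixed-length sequences can represent any sequence up to the cap.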