by diyer22
249 days ago
Yes, it's absolutely possible. Just as diffusion LLMs do it, we can do the same with DDN LLMs. I made an initial attempt to combine [DDN with GPT](https://github.com/Discrete-Distribution-Networks/Discrete-D...), aiming to remove the tokenizer and let the LLM model binary strings directly. In each forward pass, the model adaptively adjusts the byte length of the generated content based on generation difficulty, which naturally supports speculative sampling.
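To make "model binary strings directly" concrete, here is a minimal sketch of what a tokenizer-free input could look like: the text is UTF-8 encoded and every byte is written out as 8 bits. This is only an illustration and not code from the DDN repository; the function names `text_to_bits` and `bits_to_text` are hypothetical, and the actual project may represent its binary strings differently.

```python
def text_to_bits(text: str) -> str:
    """Encode text as UTF-8, then write each byte as 8 bits (MSB first)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits; expects a length that is a multiple of 8."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    bits = text_to_bits("Hi")
    print(bits)                # 0100100001101001
    print(bits_to_text(bits))  # Hi
```

With a representation like this there is no vocabulary to learn or maintain: any string maps to a sequence over {0, 1}, and how many bits or bytes the model emits per forward pass is left to the model.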