by kleiba 614 days ago
Neural LMs used to be based on recurrent architectures until the Transformer came along. That architecture is not recurrent.

I am not sure that a diffusion approach is all that suitable for generating language. Words are much more discrete than pixels.
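
A toy sketch of the discreteness point, assuming a made-up five-word vocabulary and hand-picked noise values (nothing here comes from a real model). A slightly noised pixel intensity is still a valid, nearby intensity, while a noised token ID lands on an unrelated word:

```python
# Hypothetical toy vocabulary; the IDs are arbitrary labels, not an ordered scale.
VOCAB = ["the", "cat", "sat", "tax", "on"]


def perturb_pixel(value: float, noise: float) -> float:
    """Add noise to a pixel intensity in [0, 1] and clamp to the valid range."""
    return min(1.0, max(0.0, value + noise))


def perturb_token(token_id: int, noise: float) -> int:
    """Add noise to a token ID, then round and clamp back to a valid ID."""
    noisy = round(token_id + noise)
    return min(len(VOCAB) - 1, max(0, noisy))


pixel_before = 0.50
pixel_after = perturb_pixel(pixel_before, 0.03)  # still roughly the same gray

token_before = VOCAB.index("cat")
token_after = perturb_token(token_before, 1.2)  # jumps to a different word

print(f"pixel: {pixel_before:.2f} -> {pixel_after:.2f}")
print(f"token: {VOCAB[token_before]} -> {VOCAB[token_after]}")
```

The pixel moves a small distance along a meaningful scale; the token has no such scale, so "a little noise" has no natural meaning for raw IDs.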


I meant sequential generation; I didn't mean using an RNN.

Diffusion doesn't work on pixels directly either; it works on a latent representation.

All NNs work on latent representations.
The contrast here is real: there are pixel-space diffusion models and latent-space diffusion models. Pixel-space diffusion is slower because the denoiser has to process every pixel at every step, and much of that information is redundant.
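
For a rough sense of scale, here is a back-of-the-envelope sketch. The shapes are assumptions chosen for illustration: a 512x512 RGB image, and a 4-channel latent downsampled 8x per side (the layout used by Stable Diffusion v1). Other models use different sizes.

```python
from math import prod

# Assumed shapes, for illustration only (channels, height, width).
pixel_shape = (3, 512, 512)   # RGB image in pixel space
latent_shape = (4, 64, 64)    # 8x-downsampled, 4-channel latent

pixel_values = prod(pixel_shape)
latent_values = prod(latent_shape)
ratio = pixel_values // latent_values

print(f"pixel space:  {pixel_values} values per denoising step")
print(f"latent space: {latent_values} values per denoising step")
print(f"ratio: {ratio}x")
```

Under these assumptions the denoiser sees 48x fewer values per step in latent space, which is where most of the speedup comes from. The count ignores the one-off cost of encoding to and decoding from the latent.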