by unixpickle 1799 days ago
I think you are conflating "Transformers" and "autoregressive models". Transformers are a general-purpose architecture for transforming sequences into other sequences using self-attention. Autoregressive (AR) models and GANs are frameworks for generative modeling. The model architecture is almost entirely orthogonal to the generative framework.
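
To make the distinction concrete, here is a rough NumPy sketch, not taken from either paper. One self-attention function (the architecture) is reused under two different training objectives (the frameworks): a next-token cross-entropy for AR modeling and a binary real/fake loss for a GAN discriminator. All names, sizes, and the single-head, untrained setup are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # hypothetical embedding width

# One shared set of self-attention weights: this is the "architecture".
Wq, Wk, Wv = (rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3))

def self_attention(x, causal):
    """Single-head self-attention over a (T, D) sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(D)
    if causal:
        # Mask future positions so position t only attends to positions <= t.
        future = np.triu(np.ones(scores.shape, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def ar_loss(tokens, embed, W_out):
    """Framework 1: autoregressive next-token cross-entropy."""
    h = self_attention(embed[tokens[:-1]], causal=True)
    logits = h @ W_out  # shape (T-1, V)
    logits = logits - logits.max(axis=-1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(tokens) - 1), tokens[1:]].mean()

def discriminator_loss(real, fake, w_score):
    """Framework 2: GAN discriminator loss (binary cross-entropy)."""
    def p_real(x):
        # Mean-pool the attended sequence, then score it with a linear head.
        pooled = self_attention(x, causal=False).mean(axis=0)
        return 1.0 / (1.0 + np.exp(-(pooled @ w_score)))
    return -(np.log(p_real(real)) + np.log(1.0 - p_real(fake)))

# Toy inputs (random, untrained): vocabulary of 5 tokens, sequences of length 6.
V, T = 5, 6
embed = rng.normal(size=(V, D))
W_out = rng.normal(size=(D, V))
tokens = rng.integers(0, V, size=T)
w_score = rng.normal(size=D)
real, fake = rng.normal(size=(T, D)), rng.normal(size=(T, D))

print("AR loss:", ar_loss(tokens, embed, W_out))
print("Discriminator loss:", discriminator_loss(real, fake, w_score))
```

The only architectural difference between the two uses is the causal mask. Everything else that differs (the output head and the loss) belongs to the generative framework, not to the transformer.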

You can use transformers as part of GANs [1], and you can even use them as discriminative models for images [2].

[1] https://arxiv.org/abs/2102.07074
[2] https://arxiv.org/abs/2010.11929