Hacker News
by tarvaina 1164 days ago
Not all transformers have separate encoders and decoders. GPTs, for instance, use only the equivalent of the decoder layers from the original transformer paper, yet they are still considered transformers. Karpathy’s video shows an actual GPT-style transformer.

I think a neural network can be considered a transformer if it contains a stack of attention blocks as its core mechanism.
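
To make "a stack of attention blocks" concrete, here is a rough sketch of a decoder-only stack in plain Python. It is not taken from Karpathy's video or from any real GPT implementation. All the function names and sizes are made up for illustration, and it leaves out things a real model has (token and position embeddings, multiple heads, layer norm, the output projection). It only assumes the usual shape: causal self-attention plus a small MLP, each with a residual connection, repeated N times.

```python
import math
import random

def matmul(a, b):
    # Plain list-of-lists matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def causal_self_attention(x, wq, wk, wv):
    # Single-head attention; position i may only look at positions 0..i.
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d = len(q[0])
    out = []
    for i in range(len(x)):
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        weights = softmax(scores)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights))
                    for c in range(d)])
    return out

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def block(x, params):
    # One "attention block": attention + MLP, each wrapped in a residual.
    x = add(x, causal_self_attention(x, *params["attn"]))
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, params["w1"])]
    return add(x, matmul(hidden, params["w2"]))

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def make_params(d, rng):
    # Hypothetical parameter layout, chosen only for this sketch.
    return {
        "attn": [random_matrix(d, d, rng) for _ in range(3)],
        "w1": random_matrix(d, 4 * d, rng),
        "w2": random_matrix(4 * d, d, rng),
    }

def decoder_only_stack(x, layers):
    # The "transformer" part: the same kind of block applied repeatedly.
    for params in layers:
        x = block(x, params)
    return x

rng = random.Random(0)
layers = [make_params(4, rng) for _ in range(2)]
tokens = random_matrix(3, 4, rng)  # 3 positions, 4 features each
output = decoder_only_stack(tokens, layers)
```

Note there is no encoder anywhere and no cross-attention, just the one kind of block stacked, which is the point about GPTs above.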