by logicchains
973 days ago
Most of the core idea of transformers was invented in the early 1990s, in the form of what were then called Fast Weight Programmers, which are formally equivalent to linearised self-attention: https://people.idsia.ch/~juergen/fast-weight-programmer-1991... . Google just had the hardware to actually run and experiment with large transformers.
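To make the equivalence concrete, here is a minimal sketch in plain Python. It assumes the simplest setting: causal, unnormalised attention with no softmax and an identity feature map. The function names are mine, for illustration only, and are not taken from the linked page. A fast weight matrix built up from outer products of values and keys, then applied to a query, should give the same output as summing the values weighted by key-query dot products.

```python
def fast_weight_outputs(keys, values, queries):
    """Fast-weight view: W accumulates outer(v_t, k_t); output is W @ q_t."""
    d_v, d_k = len(values[0]), len(keys[0])
    W = [[0.0] * d_k for _ in range(d_v)]  # fast weight matrix, starts at zero
    outputs = []
    for k, v, q in zip(keys, values, queries):
        # Additive outer-product update: W += v k^T
        for i in range(d_v):
            for j in range(d_k):
                W[i][j] += v[i] * k[j]
        # Read out with the current query: y = W q
        outputs.append([sum(W[i][j] * q[j] for j in range(d_k)) for i in range(d_v)])
    return outputs


def linear_attention_outputs(keys, values, queries):
    """Attention view: y_t = sum over s <= t of (k_s . q_t) * v_s, no softmax."""
    outputs = []
    for t, q in enumerate(queries):
        out = [0.0] * len(values[0])
        for k, v in zip(keys[: t + 1], values[: t + 1]):
            score = sum(ki * qi for ki, qi in zip(k, q))  # unnormalised score
            out = [o + score * vi for o, vi in zip(out, v)]
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    # Tiny made-up sequence: 3 steps, key/query dimension 2, value dimension 2.
    ks = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.0]]
    vs = [[2.0, 1.0], [0.0, 3.0], [1.0, -1.0]]
    qs = [[1.0, 1.0], [0.0, 1.0], [2.0, -1.0]]
    print(fast_weight_outputs(ks, vs, qs))
    print(linear_attention_outputs(ks, vs, qs))
```

The two functions compute the same thing because (sum of v k^T) q equals the sum of v (k . q). The practical difference is that the fast-weight form keeps a fixed-size state instead of revisiting every earlier step.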