by pests
399 days ago
One distinction is that the original transformer was an encoder/decoder, while (most?) LLMs today are decoder-only. The translation transformer's encoder could also look ahead, attending to the whole source sentence at once, while (most?) LLMs now only attend to previous tokens.
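
To make the "peek ahead" difference concrete, here's a rough sketch of the two attention masks in plain Python. The function name `attention_mask` and the `causal` flag are made up for illustration and aren't taken from any particular library; real implementations typically apply a mask like this to the attention scores before the softmax.

```python
def attention_mask(seq_len, causal):
    """Build a seq_len x seq_len mask (hypothetical helper for illustration).

    mask[i][j] is True when position i is allowed to attend to position j.
    """
    return [
        [(j <= i) if causal else True for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Encoder-style: every token can see every other token, including later ones.
bidirectional = attention_mask(4, causal=False)

# Decoder-style (typical LLM): each token sees only itself and earlier tokens.
causal = attention_mask(4, causal=True)

for row in causal:
    print(["x" if allowed else "." for allowed in row])
```

In the causal case the allowed positions form a lower triangle, which is what lets a decoder-only model be trained to predict each next token without seeing it.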