by lolinder
1092 days ago
The original paper[0] that laid the foundation for modern LLMs was demonstrated on machine translation tasks. It's one of the primary use cases these architectures were designed for. What other types of models do you have in mind that outperform them?

[0] "Attention Is All You Need" https://arxiv.org/pdf/1706.03762.pdf