Hacker News
by raindear 799 days ago
But why do transformers perform better than older language models, including other neural language models?