by Translationaut 1098 days ago
Those minified models are still as large as, or larger than, the original "Attention Is All You Need" transformer.