Hacker News
by
tigershark
517 days ago
The largest model they used has only 760M parameters, yet it outperforms models an order of magnitude larger.
1 comment
NotAnOtter
516 days ago
Gah dmn