Hacker News
by
fishpham
129 days ago
Those won’t be sufficient to run SOTA/trillion-parameter models.
2 comments
Zambyte
129 days ago
And most tasks don't demand that.
general1465
129 days ago
Distilled models are good enough.