Hacker News
by fishpham 129 days ago
Those won’t be sufficient to run SOTA, trillion-parameter models.
2 comments

And most tasks don't demand that.
Distilled models are good enough.