by kurthr 521 days ago
Wow, yeah, a 10B-parameter model is pretty tiny, and 300 3-GPU clusters for $18M is not really cheap.

I guess enormous is in the eye of the beholder.