Hacker News
by rudedogg 15 days ago
That seems way off to me.

I skimmed the article but couldn’t spot any details on how they arrived at their estimates. They describe 70B+ params as large in several places, but we’ve had several 100B+ param models that trail Sonnet.