by vessenes
409 days ago
“Every” is unsubstantiated but probably accurate. Meta has published theirs (Behemoth), and it’s clear this is largely how frontier models are being used and trained right now: too slow and expensive for daily-driver inference, but distillable at various levels for different tradeoffs.