by zozbot234
76 days ago
> These models are dumber and slower than API SoTA models and will always be.

Sure, but you're paying per-token costs on the SoTA models that are roughly an order of magnitude higher than third-party inference on the locally available models. So when you account for per-token cost, the math skews the other way.
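To make the per-token point concrete, here is a rough sketch of how many tokens a fixed budget buys at each price. The two prices are hypothetical placeholders picked only to reflect the roughly 10x gap mentioned above, not real quotes from any provider, and `tokens_for_budget` is just an illustrative helper.

```python
def tokens_for_budget(budget_usd: float, usd_per_million_tokens: float) -> float:
    """Return how many tokens a budget buys at a given price per million tokens."""
    return budget_usd / usd_per_million_tokens * 1_000_000


# Hypothetical prices (assumptions, not real provider pricing),
# chosen only to mirror the ~10x gap described in the comment.
SOTA_USD_PER_MTOK = 10.0   # assumed API price for a SoTA model
OPEN_USD_PER_MTOK = 1.0    # assumed third-party price for an open-weights model

budget = 100.0  # same spend in both cases
sota_tokens = tokens_for_budget(budget, SOTA_USD_PER_MTOK)
open_tokens = tokens_for_budget(budget, OPEN_USD_PER_MTOK)

ratio = open_tokens / sota_tokens
print(f"{ratio:.0f}x more tokens for the same spend")
```

Whether that extra token volume makes up for the weaker model depends on the workload; the sketch only covers the cost side.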