by mzl
195 days ago
While it is possible to self-host small models, it is not easy to host them at high speeds. Many small-model use cases involve large batches of work (processing large numbers of documents, agentic workflows, ...), and for those a provider with high tokens-per-second (tps) throughput makes sense. Still, I agree that self-hosting probably accounts for part of the decrease.