by nullbio
4 hours ago
I think OpenAI will keep the lead for at least another few years. Hardware costs still need to drop dramatically before open-weight models are viable for local use. Even with the release of things like Nvidia DGX Spark and Ryzen AI Halo, you'd likely want a few of them to run agents in parallel. Sure, you can use cloud-hosted variants of models like DeepSeek at API rates, but subscriptions still come out on top for bulk usage. GPT is already tightly integrated into people's workflows, has wide adoption, has good tooling for developers, etc. Plus, there's nothing stopping them from competing on price if they really feel the need. It just means they might burn more cash in the short term.
|
It's more efficient to do the opposite on a constrained platform: run agents in parallel using a single model, then round-robin among models for cross-checking purposes. (The makers of local inference engines are dropping the ball by not making batched inference a first-class citizen of that workflow. It's not just useful for vLLM and SGLang.)
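
Roughly what I mean, as a toy sketch. The models here are stubs, and `make_stub_model` and `run_pipeline` are names I made up for illustration, not any real engine's API. The assumption is that only one model fits in memory at a time, so each model gets loaded once and handles all of its prompts in a single batched call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a locally loaded model: it takes a batch of
# prompts and returns one completion per prompt from a single call.
BatchModel = Callable[[List[str]], List[str]]


def make_stub_model(name: str) -> BatchModel:
    """Build a fake model that just tags each prompt with its own name."""
    def generate(prompts: List[str]) -> List[str]:
        return [f"[{name}] {p}" for p in prompts]
    return generate


def run_pipeline(
    tasks: List[str],
    models: Dict[str, BatchModel],
    order: List[str],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Draft with the first model in `order`, then cross-check with the rest."""
    worker = order[0]
    # Phase 1: every agent shares the one resident model, so all agent
    # prompts go through as a single batch.
    drafts = models[worker](tasks)

    # Phase 2: take the remaining models one at a time (round-robin) and
    # have each review every draft in one batched call.
    reviews: Dict[str, List[str]] = {}
    for name in order[1:]:
        reviews[name] = models[name]([f"review: {d}" for d in drafts])
    return drafts, reviews


if __name__ == "__main__":
    names = ["model_a", "model_b", "model_c"]
    stubs = {n: make_stub_model(n) for n in names}
    drafts, reviews = run_pipeline(["fix bug 1", "write tests"], stubs, names)
    print(drafts)
    print(reviews)
```

The point of the shape is that each model is swapped in once per phase, not once per agent, which is where batched inference would pay off on a memory-constrained box.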