| > within a few years we will be running local models as good as today’s frontier models with almost no cost burden Based on what? The RAM requirements alone are extraordinary. No, running large models on shared, dedicated hosted hardware at full utilization is going to be vastly more cost-efficient for the foreseeable future. |
I take it you haven’t actually run any of the current gen local models?
They all fit on fairly accessible hardware, and their performance is at least on par with what I was paying for last year.
I have one of my agents running entirely on a local model on a MBP, and it has repeatedly shown it’s capable of non-trivial tasks.
Playing around with another, uncensored, local model on my 4090 desktop has me finally thinking about canceling my personal Anthropic subscription. Fully private, uncensored chat is a game changer.
For work it’s still all private models, but largely because, at this stage, it’s worth paying a premium just to be sure you’re using the best, and it saves the time of managing our own physical servers. But if we got news tomorrow that Anthropic and OpenAI were shutting down, a reasonable setup could be figured out pretty quickly.