I'd love to host my own LLMs, but I keep getting held back by the quality and affordability of cloud LLMs. Why go local unless there's private data involved?
There are some use cases where I don't care much about the data being private (although that's a plus), but I don't want to pay XXX€ to classify some data, and I particularly don't want to worry about paying that again if I want to redo it with some changes.
With local LLMs I don't worry about the price at all: I can let it make three tries per "task" without tripling the cost if I want to.
It's true that there is an upfront cost, but that hump is way easier to get over than on-demand/per-token costs, at least for me.
Same. For 'sovereignty' reasons I will eventually move to local processing, but for now, in development/prototyping, the gap with hosted LLMs seems too wide.
The $3000 that an MBP M3 Max with 64GB of RAM costs might cover a round-trip business-class transpacific ticket…if it is on sale (probably on a Chinese carrier, with GFW internet).
Some of us don't have the most reliable ISPs or even network infrastructure, and I say that as someone who lives in Spain :) I live outside a huge metropolitan area and Vodafone fiber went down twice this year, not even counting the time the country's electricity grid was down for like 24 hours.