Hacker News new | ask | show | jobs
by K0balt 424 days ago
The only bar to using local is having the hardware and downloading the model. I find it nominally easier to use than using the openAI API since the local API isn’t picky about some of the fields (by default). Agentic flows can use local 90percent of the time and reach out to god when they need divine insight, saving 90 percent of token budgets and somewhat reducing external exposure, though I prefer to keep everything locally if possible. It’s not hard to run a 70b model locally, but the queue can get backed up with multiple users unless you have very strong hardware. Still, you can shift overflow to the cloud if you want.