by Archerlm 70 days ago
From what I know, Ollama works offline, but with only 8 GB of RAM and no GPU, latency would be severely limiting. The closest local option I can suggest is LM Studio (no cloud dependencies, own model registry, etc.). If that doesn't work out for you, then try GPT4All, but its models are quite small.

I'm looking for an agent, but thanks.

I forget what the issue with GPT4All was; the tools that weren't suitable for me have blended together.

My bad, I misread your post. If GPT4All didn't work out, go with Aider. It’s a CLI tool, so there's no UI trying to proxy requests to a dev's server. You just point it at your local model (via Ollama or vLLM) and it stays in its lane. Since it’s Python-based, you can grep the source code to confirm there are no hidden update pings. If that's not for you and you need the IDE experience, pick Continue. It’s the only one that handles air-gapped setups properly. You can manually install the .vsix file and kill all telemetry in the config.json. Unlike OpenCode, it’s just a bridge between your code and your model server. OpenCode failed because it’s basically "cloud-first" pretending to be local. Aider and Continue are actually built for what you want.
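
If you'd rather not eyeball raw grep output, here's a rough sketch of that kind of source scan in Python. The pattern list and the `scan_source` name are my own guesses for illustration, not anything taken from Aider itself. A clean result doesn't prove there are no network calls; it only narrows down where to look.

```python
import re
from pathlib import Path

# Illustrative patterns only (an assumption on my part): strings that often
# show up near update checks, telemetry, or outbound HTTP calls.
SUSPECT = re.compile(
    r"telemetry|analytics|check_update|requests\.(get|post)|urlopen|https?://",
    re.IGNORECASE,
)


def scan_source(root, suffixes=(".py",)):
    """Return (path, line_number, line) for every line matching SUSPECT."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files with other extensions.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if SUSPECT.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Point `scan_source` at wherever the package source lives on your machine, then read each hit in context. Plenty of matches will be harmless (docs links, comments), so treat it as a list of places to check by hand.
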
This was such a useful post, thank you, have an upvote! I forgot about Aider.

I was looking into it but got distracted with other work. Do you know if it has any update checks or telemetry? I will check the source, but I could miss something, so I definitely want to ask people who have used it.

I think I also looked into Continue very briefly. I'm glad you put those notes about it being more of an IDE experience. Also, for this one, does it come with instructions on a GitHub page or something on how to kill all spying/telemetry?

Thanks again!