Hacker News new | ask | show | jobs
by scriptsmith 1028 days ago
How are people using these local code models? I would much prefer using these in-context in an editor, but most of them seem to be deployed just in an instruction context. There's a lot of value to not having to context switch, or have a conversation.

I see the GitHub copilot extensions gets a new release one every few days, so is it just that the way they're integrated is more complicated so not worth the effort?

3 comments

You can use Continue as a drop-in replacement for Copilot Chat with Code Llama. We've released a short tutorial here: https://continue.dev/docs/walkthroughs/codellama. It should save you a lot of time context-switching; you can just highlight code and ask questions or make edits, all with keyboard shortcuts
For in-editor like copilot you can try this locally - https://github.com/smallcloudai/refact

This works well for me except the 15B+ don't run fast enough on a 4090 - hopefully exllama supports non-llama models, or maybe it'll support CodeLLaMa already I'm not sure.

For general chat testing/usage this works pretty well with lots of options - https://github.com/oobabooga/text-generation-webui/

>This works well for me except the 15B+ don't run fast enough on a 4090

I assume quantized models will run a lot better. TheBloke already seems like he's on it.

https://huggingface.co/TheBloke/CodeLlama-13B-fp16

Unfortunately what I tested was StarCoder 4bit. We really need exllama which should make even 30b viable from what I can tell.

Because codellama is llama based it may just work possibly?

http://cursor.sh integrates GPT-4 into vscode in a sensible way. Just swapping this in place of GPT-4 would likely work perfectly. Has anyone cloned the OpenAI HTTP API yet?
LocalAI https://localai.io/ and LMStudio https://lmstudio.ai/ both have fairly complete OpenAI compatibility layers. llama-cpp-python has a FastAPI server as well: https://github.com/abetlen/llama-cpp-python/blob/main/llama_... (as of this moment it hasn't merged GGUF update yet though)
I was tasked with a massive project over the last month and I'm not sure I could have done it as fast as I have without Cursor. Also check out the Warp terminal replacement. Together it's a winning combo!