by kateklink 1016 days ago
We've finished training a new code model, Refact LLM, which took us about a month. Its main use case is blazing-fast code completion with fill-in-the-middle; the model can also reply to chat prompts.

It performs much better than all code models of similar size, and it almost matches StarCoder's HumanEval score while being 10x smaller.

Thanks to its small size, it runs on most modern GPUs and requires just 3 GB of RAM.

You can try self-hosting it in Refact https://github.com/smallcloudai/refact/ and get a fast, local Copilot alternative with decent suggestions.

Weights and model card https://huggingface.co/smallcloudai/Refact-1_6B-fim.

We would love to hear your feedback!

5 comments

How does it compare to Copilot? A metric I'd like to see is the % of proposed completions accepted by a human user. If you had an extension that proposed a Copilot completion 50% of the time and a Refact completion the other 50% (blind to the user), you could come up with a metric like this.
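A minimal sketch of the blind acceptance-rate metric described in this comment. The arm names, the `assign_arm` helper, and the log format are all hypothetical; nothing here is taken from the Copilot or Refact plugins.

```python
import random
from collections import Counter


def assign_arm(rng):
    """Pick which model serves the next completion, 50/50, hidden from the user."""
    return "copilot" if rng.random() < 0.5 else "refact"


def acceptance_rates(events):
    """events: iterable of (arm, accepted) pairs. Returns {arm: accepted / shown}."""
    shown, accepted = Counter(), Counter()
    for arm, was_accepted in events:
        shown[arm] += 1
        if was_accepted:
            accepted[arm] += 1
    return {arm: accepted[arm] / shown[arm] for arm in shown}


# Hypothetical usage: the editor extension would log one event per suggestion shown.
rng = random.Random(0)
arm_for_next_suggestion = assign_arm(rng)
```

With enough logged events per arm, the two rates can be compared directly, ideally with a confidence interval, since acceptance depends heavily on the user and the codebase.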
Does ctransformers (https://github.com/marella/ctransformers#supported-models) support running Refact?

I see the model type is "gpt_refact" in https://huggingface.co/smallcloudai/Refact-1_6B-fim/blob/mai...

Is it possible to run it as an LSP so that it can be used in editors other than VSCode and JetBrains? (sorry if this question is completely mad, my understanding of how these things work is extremely limited)
Yes, it's coming up in a couple of weeks.
Great, thanks. I'll keep an eye out.
Hi, I'm trying to fine-tune the Refact model using evolve code alpaca, but the loss always stays above 2. I've tried several different params, but nothing works. Can you give me some advice?
> almost reaches the same HumanEval

How can you tell that HumanEval has not leaked into your training data in some form?

Hi! We ran LSH filtering over the datasets to remove any code similar to HumanEval samples.
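For readers unfamiliar with the idea, here is a rough sketch of what LSH-based decontamination can look like. It is not the authors' pipeline: the MinHash scheme, the token shingles, the band sizes, and the 0.5 Jaccard threshold are all assumptions chosen for illustration.

```python
import hashlib
import re

NUM_HASHES = 64            # length of each MinHash signature (assumed)
BANDS = 16                 # LSH bands (assumed)
ROWS = NUM_HASHES // BANDS  # signature rows per band


def shingles(code, k=5):
    """Set of k-token shingles; short snippets collapse to a single shingle."""
    tokens = re.findall(r"\w+|[^\w\s]", code)
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def _hash(seed, text):
    # Salting blake2b with the seed gives one hash function per signature slot.
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little")


def minhash(shingle_set):
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(NUM_HASHES)]


def band_keys(signature):
    return {(b, tuple(signature[b * ROWS:(b + 1) * ROWS])) for b in range(BANDS)}


def jaccard(a, b):
    return len(a & b) / len(a | b)


def filter_contaminated(corpus, benchmark, threshold=0.5):
    """Drop corpus documents whose shingles overlap a benchmark sample too much."""
    bench_shingles = [shingles(sample) for sample in benchmark]
    buckets = {}
    for idx, sh in enumerate(bench_shingles):
        for key in band_keys(minhash(sh)):
            buckets.setdefault(key, set()).add(idx)
    kept = []
    for doc in corpus:
        sh = shingles(doc)
        candidates = set()
        for key in band_keys(minhash(sh)):
            candidates |= buckets.get(key, set())
        # LSH only proposes candidates; confirm with the exact Jaccard similarity.
        if any(jaccard(sh, bench_shingles[i]) >= threshold for i in candidates):
            continue
        kept.append(doc)
    return kept
```

The banding step keeps the comparison cheap: only documents that share at least one band with a benchmark sample are compared exactly, so the filter scales to large corpora.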
So, we have to trust your procedure...
You can check whether the model predicts the canonical solutions from HumanEval. I understand this is not ideal, but at least you can check it yourself.

There are a bunch of other benchmarks too; check out the page https://huggingface.co/smallcloudai/Refact-1_6B-fim

Also, feel free to run any new benchmarks.
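One way to run the canonical-solution check mentioned in this reply, as a sketch only. The `generate` callable is a hypothetical stand-in for however you query the model, and the 0.9 similarity threshold is an assumption; `prompt` and `canonical_solution` are the field names used in the HumanEval dataset.

```python
from difflib import SequenceMatcher


def normalize(code):
    """Strip indentation and blank lines so formatting differences do not matter."""
    return "\n".join(line.strip() for line in code.strip().splitlines() if line.strip())


def memorization_rate(problems, generate, threshold=0.9):
    """Fraction of problems where the completion nearly equals the canonical solution.

    problems: list of dicts with "prompt" and "canonical_solution" keys.
    generate: hypothetical callable mapping a prompt string to a completion string.
    """
    hits = 0
    for problem in problems:
        completion = generate(problem["prompt"])
        ratio = SequenceMatcher(
            None, normalize(completion), normalize(problem["canonical_solution"])
        ).ratio()
        if ratio >= threshold:
            hits += 1
    return hits / len(problems)
```

A high rate would suggest memorization, but a low rate does not prove the benchmark is absent from the training data, since a model can be influenced by leaked samples without reproducing them verbatim.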