by phantom32 1180 days ago
I wonder if FauxPilot's models (the Salesforce CodeGen family) can be quantized and run on the CPU. I was able to run the 350M model on my machine, but it wasn't able to compete with Copilot in any way. Salesforce claims in their GitHub description [1] that their model is competitive with OpenAI Codex. Maybe their largest 16B model is, but I haven't been able to try it.

[1] https://github.com/salesforce/CodeGen


We will add quantized CodeGen for fast inference on CPUs up on cformers (https://github.com/NolanoOrg/cformers/) by later today.
> by later today

Wow, that's the pace things are moving at right now. We'd better get used to it!

Whoa, is there a PR or wiki about this?
4-bit GPTQ, maybe?