by ayushkaushal 1185 days ago
We will add quantized CodeGen for fast CPU inference to cformers (https://github.com/NolanoOrg/cformers/) by later today.
3 comments

> by later today

Wow, that's the pace things are moving at right now. We'd better get used to it!

Whoa, is there a PR or wiki about this?
4-bit GPTQ, maybe?