New Mixtral HQQ Quantized 4-bit/2-bit configuration (huggingface.co)
5 points by ibuildthings 916 days ago
1 comment

We are releasing new 2-bit Mixtral models. They use a mixed HQQ 4-bit/2-bit configuration, which significantly improves the model (perplexity 4.69 vs. 5.90) for a negligible 0.20 GB increase in VRAM.
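
To put the numbers above in perspective, here is a rough Python sketch. It computes the relative perplexity drop from the two quoted figures and shows one way a mixed configuration could assign bit widths per layer. The post does not say which layers get 4 bits, so the split below (attention at 4-bit, expert weights at 2-bit) is an assumption, and `bits_for` is a hypothetical helper, not part of the HQQ API. The layer names follow the Hugging Face Mixtral naming scheme and are used for illustration only.

```python
# Figures quoted in the post.
PPL_MIXED = 4.69   # mixed 4-bit/2-bit configuration
PPL_BASE = 5.90    # comparison figure, presumably the earlier 2-bit release


def relative_reduction(old: float, new: float) -> float:
    """Fractional decrease when going from `old` to `new`."""
    return (old - new) / old


def bits_for(layer_name: str) -> int:
    """Hypothetical rule for a mixed configuration.

    Assumption (not stated in the post): attention projections are kept
    at 4 bits and everything else, including expert weights, at 2 bits.
    """
    return 4 if "self_attn" in layer_name else 2


ppl_drop = relative_reduction(PPL_BASE, PPL_MIXED)
print(f"perplexity reduction: {ppl_drop:.1%}")

for name in (
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.block_sparse_moe.experts.3.w1",
):
    print(f"{name}: {bits_for(name)}-bit")
```

Under that assumed split, the small memory cost would make sense: in a mixture-of-experts model the expert weights hold most of the parameters, so raising the precision of the attention layers adds little to the total.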

Base: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-a...

Instruct: https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-Instruct-...

Shout-out to Artem Eliseev and Denis Mazur for suggesting this idea ( https://github.com/mobiusml/hqq/issues/2 )