by simjnd
57 days ago
For TurboQuant on model weights, AFAIK it's currently a single-person effort [1]. It needs his fork of llama.cpp, which hasn't been upstreamed. He publishes his quantizations on HuggingFace, but I'm not sure whether he has open-sourced the quantization pipeline.

[1]: https://x.com/coffeecup2020