by ankraft 1162 days ago
That sounds interesting. How can I obtain this Vicuna model that works with llama.cpp?

https://medium.com/@martin-thissen/vicuna-on-your-cpu-gpu-be...

See the section "CPU Installation (GGML Quantised)"

You need Python to download the model from Hugging Face through the official API. After that, all you need is the binary weights file and a compiled llama.cpp binary.

P.S. The author seems to have renamed their repo to "eachadea/legacy-vicuna-13b" on HuggingFace
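
For what it's worth, here is a rough sketch of those two steps in Python. It assumes the `huggingface_hub` package is installed and that llama.cpp has been built so that a `./main` binary exists in the current directory (newer builds may name the binary differently). The weights file name below is a placeholder I made up, so check the repo's file list for the real one. The helper names are mine too.

```python
import subprocess
from pathlib import Path

REPO_ID = "eachadea/legacy-vicuna-13b"  # repo name from the P.S. above
# ASSUMPTION: placeholder file name, not taken from the repo; look up the real GGML file.
WEIGHTS_FILE = "ggml-model-q4_0.bin"


def download_weights(repo_id: str = REPO_ID, filename: str = WEIGHTS_FILE) -> Path:
    """Fetch one file from the Hugging Face Hub and return its local cache path."""
    # Imported here so the rest of the sketch loads without the package installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=repo_id, filename=filename))


def llama_cpp_command(model_path, prompt, binary="./main", n_predict=128):
    """Build the argument list for one llama.cpp run (-m model, -p prompt, -n tokens)."""
    return [binary, "-m", str(model_path), "-p", prompt, "-n", str(n_predict)]


def main():
    # Download the weights, then hand them to the compiled llama.cpp binary.
    model = download_weights()
    subprocess.run(llama_cpp_command(model, "Hello, Vicuna!"), check=True)
```

Calling `main()` does the download and then runs the model once. The command builder is kept separate so you can print the arguments and check them before launching anything.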