https://medium.com/@martin-thissen/vicuna-on-your-cpu-gpu-be...

See the section "CPU Installation (GGML Quantised)"

You only need Python to download the model from HuggingFace using the official API. After that, all you need is the binary file with the weights and a compiled binary of llama.cpp.
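Roughly, the two steps look like this. This is only a sketch, not the article's exact script: I'm assuming the `huggingface_hub` package is what's meant by the official API, the weights filename is a made-up placeholder (check the repo's file list), and `./main` with `-m`/`-p`/`-n` is how the llama.cpp example binary is usually invoked, though your build may name it differently.

```python
from pathlib import Path

# Repo name as given in the P.S. of this comment.
REPO_ID = "eachadea/legacy-vicuna-13b"
# Hypothetical filename: look up the real one in the repo's file list.
WEIGHTS_FILENAME = "ggml-model-q4_0.bin"


def download_weights(repo_id=REPO_ID, filename=WEIGHTS_FILENAME, local_dir="models"):
    """Fetch a single file from the Hugging Face Hub and return its local path."""
    # Imported here so the command builder below works without the package.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir))


def llama_cpp_command(model_path, prompt, n_predict=128, binary="./main"):
    """Build the argument list for a compiled llama.cpp binary.

    Assumes the binary accepts -m (model file), -p (prompt) and
    -n (number of tokens to generate).
    """
    return [binary, "-m", str(model_path), "-p", prompt, "-n", str(n_predict)]
```

To actually run it, something like `subprocess.run(llama_cpp_command(download_weights(), "Hello"), check=True)` from the directory where you compiled llama.cpp should do. Python is only involved in the download; you can just as well paste the resulting command into a shell.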

P.S. The author seems to have renamed their repo to "eachadea/legacy-vicuna-13b" on HuggingFace.