|
|
|
|
|
by smcleod
570 days ago
|
|
Neat to see more folks writing blogs on their experiences. This however does seem like it's an over-complicated method of building llama.cpp. Assuming you want to do this iteratively (at least for the first time) should only need to run: ccmake .
And toggle the parameters your hardware supports or that you want (e.g. if CUDA if you're using Nvidia, Metal if you're using Apple etc..), and press 'c' (configure) then 'g' (generate), then: cmake --build . -j $(expr $(nproc) / 2)
Done.If you want to move the binaries into your PATH, you could then optionally run cmake install. |
|
In that case, the steps to building llama.cpp are:
1. Clone the repo.
2. Run `make`.
To start chatting with a model all you need is to:
1. Download the model you want in gguf format that will fit into your hardware (probably the hardest step, but readily available on HuggingFace)
2. Run `./llama-server -m model.gguf`.
3. Visit localhost:8080