Y
Hacker News
new
|
ask
|
show
|
jobs
by
misiti3780
70 days ago
what HW are you running them on ? are you using OLLAMA ?
1 comments
vunderba
70 days ago
I'm using the default llama-server that is part of Gerganov's LLM inference system running on a headless machine with an nVidia 16GB GPU, but Ollama's a bit easier to ease into since they have a preset model library.
https://github.com/ggml-org/llama.cpp
link
https://github.com/ggml-org/llama.cpp