by yjftsjthsd-h
565 days ago
You might also try https://github.com/Mozilla-Ocho/llamafile , which may have better CPU-only performance than ollama. It does require you to grab .gguf files yourself (unless you use one of their prebuilts, in which case the model comes bundled with the binary!), but with that done it's really easy to use and has decent performance. For reference, this is how I run it:
$ cat ~/.config/systemd/user/llamafile@.service
[Unit]
Description=llamafile with arbitrary model
After=network.target
[Service]
Type=simple
WorkingDirectory=%h/llms/
ExecStart=sh -c "%h/.local/bin/llamafile -m %h/llamafile-models/%i.gguf --server --host '::' --port 8081 --nobrowser --log-disable"
[Install]
WantedBy=default.target
And then systemctl --user start llamafile@whatevermodel
but you can just run that ExecStart command directly and it works.
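For anyone who wants the "run it directly" version spelled out, here is a rough sketch of the same ExecStart line as a shell function. It assumes the same layout as the unit above: systemd expands %h to your home directory and %i to the instance name (the part after the @), so those become $HOME and the first argument. The function name run_llamafile and the LLAMAFILE_BIN override are made up for illustration and are not part of llamafile. Unlike the unit, this does not change into ~/llms/ first.

```shell
# Hypothetical helper mirroring the unit's ExecStart line.
# %h in the unit becomes $HOME; %i becomes the model name passed as $1.
# LLAMAFILE_BIN is an illustrative override, not a llamafile feature.
run_llamafile() {
    model="$1"
    if [ -z "$model" ]; then
        echo "usage: run_llamafile MODEL_NAME" >&2
        return 2
    fi
    bin="${LLAMAFILE_BIN:-$HOME/.local/bin/llamafile}"
    # Same flags as in the unit file above.
    "$bin" -m "$HOME/llamafile-models/$model.gguf" \
        --server --host '::' --port 8081 --nobrowser --log-disable
}
```

Then `run_llamafile whatevermodel` should behave like starting llamafile@whatevermodel, just in the foreground of your terminal instead of under systemd.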