| HN Mirror


by c-hendricks 5 days ago

Nice, was wondering if there was a flag for the draft as well.

Not knocking huggingface-cli, I just find it's much easier for people to try this stuff out when they can just run:

  mise use --global github:ggml-org/llama.cpp
  LLAMA_CACHE="models" llama-server \
    -hf unsloth/gemma-4-26B-A4B-it-qat-GGUF:UD-Q4_K_XL \
    --host 0.0.0.0 \
    --port 11434 \
    ...


dofm 5 days ago

  --no-mmproj

is also pretty useful if you're doing this just to try agentic coding and you're not processing images/voice. Stops it downloading the multimodal projector.
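For illustration, here is a rough sketch of the parent comment's command with that flag added. It assumes the same model, host, and port as above, and leaves out the elided `...` arguments rather than guessing them. The `RUN` variable is a hypothetical dry-run switch for this sketch only, not something llama.cpp provides.

```shell
# Sketch only: the parent command's arguments plus --no-mmproj.
# Assumes llama-server is already on PATH (e.g. installed via mise as above).
set -- \
  -hf unsloth/gemma-4-26B-A4B-it-qat-GGUF:UD-Q4_K_XL \
  --no-mmproj \
  --host 0.0.0.0 \
  --port 11434

# Dry run: print the command that would be executed.
echo "LLAMA_CACHE=models llama-server $*"

# RUN is a hypothetical switch for this sketch; set RUN=1 to start the server.
if [ "${RUN:-0}" = "1" ]; then
  LLAMA_CACHE="models" llama-server "$@"
fi
```

Without `RUN=1` this only prints the command, so it is safe to paste and check before anything gets downloaded.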
