| HN Mirror

by monkmartinez 409 days ago

If you run `ollama pull <model>`, the modelfile is downloaded along with the blob. To modify the model permanently, you can copypasta the modelfile into a text editor and then create a new model from the old modelfile with the changes you need.

Here is my workflow when using Open WebUI:

1. ollama show qwen3:30b-a3b-q8_0 --modelfile

2. Paste the contents of the modelfile into Open WebUI (Admin -> Models) and rename the model, e.g. qwen3:30b-a3b-q8_0-monkversion-1

3. Change parameters, e.g. num_gpu 90 to change the number of layers offloaded to the GPU, etc.

4. Keep or delete the old file

Pay attention to the modelfile. It will show you something like this:

  # To build a new Modelfile based on this, replace FROM with:
  # FROM qwen3:30b-a3b-q8_0

and you need to make sure the paths are correct. For example, I store my models on a large NVMe drive instead of Ollama's default location, which is why that matters.
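A rough sketch of that FROM swap in Python, in case it helps. It assumes the dumped modelfile has a single active FROM line pointing at a blob path; the function name `rebase_modelfile`, the blob path, and the num_ctx values are made up for illustration. It does not try to parse multi-line TEMPLATE blocks, so treat it as a starting point rather than a real parser.

```python
def rebase_modelfile(dumped: str, base_model: str, overrides: dict) -> str:
    """Point a dumped modelfile at an existing model name and override parameters."""
    out = []
    for line in dumped.splitlines():
        parts = line.strip().split()
        # Replace the active FROM line (comment lines start with '#', so they are kept).
        if parts and parts[0].upper() == "FROM":
            out.append(f"FROM {base_model}")
            continue
        # Drop PARAMETER lines that are about to be overridden.
        if len(parts) >= 2 and parts[0].upper() == "PARAMETER" and parts[1] in overrides:
            continue
        out.append(line)
    # Append the new parameter values at the end.
    out.extend(f"PARAMETER {name} {value}" for name, value in overrides.items())
    return "\n".join(out) + "\n"


# Hypothetical dump; the blob path and the original num_ctx are placeholders.
dumped = """# To build a new Modelfile based on this, replace FROM with:
# FROM qwen3:30b-a3b-q8_0

FROM /path/to/blobs/sha256-example
PARAMETER num_ctx 40960
"""

print(rebase_modelfile(dumped, "qwen3:30b-a3b-q8_0", {"num_ctx": 8192}))
```

The result can be saved to a file and passed to `ollama create <new-name> -f <file>`.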

EDIT TO ADD: The 'modelfile' workflow is a pain in the booty. It's a dogwater pattern and I hate it. Some of these models are 30 to 60GB and copying the entire thing to change one parameter is just dumb.

However, ollama does a lot of things right and it makes it easy to get up and running. vLLM, SGLang, Mistral.rs and even llama.cpp require a lot more work to set up.

rahimnathwani 409 days ago

Sorry, I should have been clearer.

I meant when you download a gguf file from huggingface, instead of using a model from ollama's library.

monkmartinez 409 days ago

Run `ollama pull hf.co/unsloth/Qwen3-30B-A3B-GGUF:Q4_K_M` and the modelfile comes with it. The template or parameters may have errors when pulled this way. The model has to be converted to GGUF/GGML before you can use it this way. You can, of course, also convert bf16 safetensors and create the specific ollama model yourself.

rahimnathwani 409 days ago

Yeah when I do this, the modelfile has only FROM and TEMPLATE. No PARAMETERs:

  ollama pull hf.co/jedisct1/MiMo-7B-RL-GGUF:Q4_K_M
  ollama show --modelfile hf.co/jedisct1/MiMo-7B-RL-GGUF:Q4_K_M
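If you want to check this programmatically, here is a small sketch that counts the directives in a modelfile dump. The helper name `count_directives` and the sample TEMPLATE body are made up for illustration; it assumes multi-line values are wrapped in triple double quotes.

```python
from collections import Counter


def count_directives(modelfile: str) -> Counter:
    """Count top-level directives (FROM, TEMPLATE, PARAMETER, ...) in a modelfile."""
    counts = Counter()
    in_block = False  # True while inside a multi-line triple-quoted value
    for line in modelfile.splitlines():
        if in_block:
            if '"""' in line:
                in_block = False
            continue
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        counts[stripped.split(None, 1)[0].upper()] += 1
        # A single triple quote on the line opens a block that closes later.
        if stripped.count('"""') == 1:
            in_block = True
    return counts


# Illustrative dump with only FROM and TEMPLATE; the template text is a placeholder.
sample = '''FROM hf.co/jedisct1/MiMo-7B-RL-GGUF:Q4_K_M
TEMPLATE """{{ if .System }}{{ .System }}
{{ end }}{{ .Prompt }}"""
'''

counts = count_directives(sample)
print(dict(counts))
print("has PARAMETER lines:", counts["PARAMETER"] > 0)
```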

o11c 409 days ago

Pretty sure the whole reason Ollama uses raw hashes everywhere is to avoid copying the whole NN gigabytes every time.

monkmartinez 409 days ago

Maybe I am doing something wrong! When I change parameters on the modelfile, the whole thing is copied. You can't just edit the file as far as I know, you have to create another 38GB monster to change num_ctx to a reasonable number.

o11c 409 days ago

The parameters (prompt, etc.) should be set only in the new modelfile (passed to `ollama create`), using a FROM that references the previous ollama model. Parameters in a Modelfile override the hard-coded parameters from the GGUF itself (which are sometimes buggy; in fact, from elsewhere in the thread it sounds like MiMo is missing proper stop tokens, or maybe templates in general; I'm not an expert).

This will show a separate entry in `ollama list`, but only copy the Modelfile, not the GGUF.
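As a minimal sketch of that approach: generate a tiny modelfile that only has a FROM pointing at the existing model plus the parameters you want to change. The helper names, the new model name `qwen3-small-ctx`, and the num_ctx value are all hypothetical.

```python
def derived_modelfile(base_model: str, params: dict) -> str:
    """Build a minimal modelfile that inherits from an existing ollama model."""
    lines = [f"FROM {base_model}"]
    lines += [f"PARAMETER {name} {value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"


def create_command(new_name: str, modelfile_path: str) -> list:
    """Argument list for registering the derived model with the ollama CLI."""
    return ["ollama", "create", new_name, "-f", modelfile_path]


text = derived_modelfile("qwen3:30b-a3b-q8_0", {"num_ctx": 8192})
print(text)
# Write `text` to a file named Modelfile, then run the command below,
# e.g. with subprocess.run(cmd, check=True).
cmd = create_command("qwen3-small-ctx", "Modelfile")
print(" ".join(cmd))
```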

Alternatively, if you use the API, you can override parameters "temporarily". Some UIs let you do this easily, at least for common parameters.
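For the API route, something like the following should work as a sketch. It assumes a local ollama server on the default port and uses the `options` field of the generate endpoint for a per-request override; the model name, prompt, and num_ctx value are just examples.

```python
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, options: dict) -> urllib.request.Request:
    """Build a generate request whose options apply to this call only."""
    payload = {"model": model, "prompt": prompt, "stream": False, "options": options}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, options: dict) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt, options)) as resp:
        return json.loads(resp.read())["response"]


# Example call, left commented out because it needs a live server:
# print(generate("qwen3:30b-a3b-q8_0", "Say hi.", {"num_ctx": 8192}))
```

Nothing is written to disk here, so the stored model and its modelfile stay untouched.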
