| If you ollama pull <model>, the modelfile is downloaded along with the blob. To modify the model permanently, copy-paste the modelfile into a text editor, make your changes, and create a new model from the edited modelfile. Here is my workflow when using Open WebUI:
1. Run ollama show qwen3:30b-a3b-q8_0 --modelfile
2. Paste the contents of the modelfile into Open WebUI (Admin -> Models) and give it a new name, e.g. qwen3:30b-a3b-q8_0-monkversion-1
3. Change parameters, e.g. num_gpu 90 to change how many layers are offloaded to the GPU, etc.
4. Keep or delete the old one
Pay attention to the modelfile. It will show you something like this:
# To build a new Modelfile based on this, replace FROM with:
# FROM qwen3:30b-a3b-q8_0
You need to make sure the paths are correct. This matters, for example, if your models are not in Ollama's default location; I store mine on a large NVMe drive.
EDIT TO ADD:
The 'modelfile' workflow is a pain in the booty. It's a dogwater pattern and I hate it. Some of these models are 30 to 60 GB, and copying the entire thing to change one parameter is just dumb. However, Ollama does a lot of things right and makes it easy to get up and running. vLLM, SGLang, mistral.rs and even llama.cpp require a lot more work to set up. |
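For anyone who would rather script the workflow than paste into a UI, here is a rough Python sketch of the same idea. It takes the text printed by ollama show <model> --modelfile, swaps the first FROM line for the base model name (as the generated comment suggests), and overrides the parameters you pass in. The function names and the new model name in the comment are made up for illustration, and the parsing is naive line-by-line handling that assumes a single FROM line, so treat it as a starting point rather than a finished tool.

```python
import subprocess


def patch_modelfile(text: str, base: str, params: dict) -> str:
    """Return Modelfile text that builds on `base` with `params` overridden.

    Naive line-based sketch: only the first FROM line is replaced.
    """
    out = []
    from_done = False
    for line in text.splitlines():
        parts = line.split()
        # Replace the first FROM (usually a blob path) with the model name.
        if parts and parts[0].upper() == "FROM" and not from_done:
            out.append(f"FROM {base}")
            from_done = True
            continue
        # Drop old values of any parameter we are about to override.
        if len(parts) >= 2 and parts[0].upper() == "PARAMETER" and parts[1] in params:
            continue
        out.append(line)
    if not from_done:
        out.insert(0, f"FROM {base}")
    out.extend(f"PARAMETER {key} {value}" for key, value in params.items())
    return "\n".join(out) + "\n"


def rebuild(base: str, new_name: str, params: dict, path: str = "Modelfile") -> None:
    """Dump the base model's modelfile, patch it, and create a new model."""
    shown = subprocess.run(
        ["ollama", "show", base, "--modelfile"],
        check=True, capture_output=True, text=True,
    ).stdout
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(patch_modelfile(shown, base, params))
    subprocess.run(["ollama", "create", new_name, "-f", path], check=True)


# Example call (hypothetical new model name):
# rebuild("qwen3:30b-a3b-q8_0", "qwen3-30b-gpu90", {"num_gpu": "90"})
```

Keeping the text patching separate from the ollama calls means you can check the resulting Modelfile before anything gets created.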
I meant when you download a GGUF file from Hugging Face instead of using a model from Ollama's library.
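For that case, a Modelfile can point FROM at the local GGUF file instead of a library model name. Below is a minimal sketch in Python that only generates the Modelfile text. The helper name, the example file path, and the parameter value are placeholders, not anything Ollama ships, and the sketch does not handle chat templates or paths that need quoting.

```python
from pathlib import Path
from typing import Optional


def modelfile_for_gguf(gguf_path: str, params: Optional[dict] = None) -> str:
    """Build a minimal Modelfile whose FROM points at a local GGUF file."""
    # Use an absolute path so the Modelfile works wherever it is saved.
    lines = [f"FROM {Path(gguf_path).expanduser().resolve()}"]
    for key, value in (params or {}).items():
        lines.append(f"PARAMETER {key} {value}")
    return "\n".join(lines) + "\n"


# Example with a hypothetical download location:
# print(modelfile_for_gguf("~/Downloads/example.gguf", {"num_ctx": "8192"}))
```

Save the output as Modelfile and run ollama create <new-name> -f Modelfile to register it.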