| HN Mirror


	by 0x00cl 350 days ago
	I see you are using Ollama's GGUFs. By default it will download the Q4_0 quantization. Try `gemma3:270m-it-bf16` instead, or you can also use the Unsloth GGUFs: `hf.co/unsloth/gemma-3-270m-it-GGUF:16`. You'll get better results.

2 comments

simonw 350 days ago

Good call, I'm trying that one just now in LM Studio (by clicking "Use this model -> LM Studio" on https://huggingface.co/unsloth/gemma-3-270m-it-GGUF and selecting the F16 one).

(It did not do noticeably better at my pelican test).

Actually it's worse than that: several of my attempts resulted in infinite loops spitting out the same text. Maybe that GGUF is a bit broken?

danielhanchen 349 days ago

Oh :( Maybe the settings? Could you try

temperature = 1.0, top_k = 64, top_p = 0.95, min_p = 0.0
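For readers unfamiliar with these knobs, here is a rough sketch of what each setting does to the next-token distribution. It is an illustration only: the function names are made up, and real runtimes such as llama.cpp apply their samplers in their own order and renormalize between steps, which this simplified version does not do.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities; temperature = 1.0 leaves them unscaled."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_probs(probs, top_k=64, top_p=0.95, min_p=0.0):
    """Filter a {token: probability} dict and renormalize what survives.

    Assumed order (a simplification): top_k, then min_p, then top_p.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    # top_k: keep only the k most likely tokens (0 disables the filter here).
    if top_k > 0:
        ranked = ranked[:top_k]

    # min_p: drop tokens below min_p times the most likely token's probability.
    # With min_p = 0.0 nothing is dropped.
    if min_p > 0.0:
        threshold = min_p * ranked[0][1]
        ranked = [(tok, p) for tok, p in ranked if p >= threshold]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}
```

With the values suggested here, temperature leaves the distribution as the model produced it, top_k caps the candidates at 64 tokens, top_p trims the unlikely tail, and min_p is effectively switched off.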

canyon289 349 days ago

Daniel, thanks for being here providing technical support as well. Cannot express enough how much we appreciate all your work and partnership.

danielhanchen 349 days ago

Thank you and fantastic work with Gemma models!

simonw 349 days ago

My tooling only lets me set temperature and top_p, but setting them to those values did seem to avoid the infinite loops, thanks.

danielhanchen 349 days ago

Oh fantastic, it worked! I was actually trying to see if we can auto-set these within LM Studio (Ollama, for example, has params and template). Not sure if you know how that can be done? :)

JLCarveth 349 days ago

I ran into the same looping issue with that model.

danielhanchen 349 days ago

Definitely give

temperature = 1.0, top_k = 64, top_p = 0.95, min_p = 0.0

a try, and maybe repeat_penalty = 1.1
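One way to pass these settings when running the model through Ollama is the `options` field of its local HTTP API. This is a minimal sketch, assuming an Ollama server on the default port and the `gemma3:270m-it-bf16` tag mentioned at the top of the thread; `build_request` and `generate` are hypothetical helper names, not part of any library.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="gemma3:270m-it-bf16"):
    """Build the JSON body for a non-streaming generate call."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a stream of chunks
        "options": {
            # Sampling settings suggested in this thread.
            "temperature": 1.0,
            "top_k": 64,
            "top_p": 0.95,
            "min_p": 0.0,
            "repeat_penalty": 1.1,
        },
    }


def generate(prompt):
    """Send the prompt to the local server and return the generated text."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `generate("Generate an SVG of a pelican riding a bicycle")` would then run the prompt with these settings, provided the server is up and the model has been pulled.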

Patrick_Devine 349 days ago

We uploaded gemma3:270m-it-q8_0 and gemma3:270m-it-fp16 late last night which have better results. The q4_0 is the QAT model, but we're still looking at it as there are some issues.
