HN Mirror

	by c0brac0bra 454 days ago
	What tasks have you found the 0.6B model useful for? The hallucinations apparent in its thinking process raised a big red flag for me. By contrast, the 4B model seemed to work really well and gave results comparable to Gemini 2.0 Flash (at least in my simple tests).

2 comments

SparkyMcUnicorn 454 days ago

You can use 0.6B as the draft model for speculative decoding with the larger models. It speeds up 32B but slows down 30B-A3B dramatically.
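
For anyone unfamiliar with the idea, here is a minimal toy sketch of greedy speculative decoding in Python. It is not how any particular runtime implements it: `target_model` and `draft_model` are hypothetical stand-ins (plain functions over integer tokens) for the large and the 0.6B model, and the verification loop simulates what a real implementation would do in one batched forward pass of the target.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]  # greedy next-token function


def greedy_decode(model: Model, prompt: List[Token], max_new: int) -> List[Token]:
    """Plain autoregressive decoding: one model call per new token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[Token], max_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy speculative decoding.

    Returns the decoded tokens and the number of target verification passes.
    """
    out = list(prompt)
    goal = len(prompt) + max_new
    passes = 0
    while len(out) < goal:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        ctx = list(out)
        for _ in range(k):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)

        # 2. The target checks every proposed position. This loop stands in
        #    for a single batched forward pass, so it counts as one pass.
        passes += 1
        ctx = list(out)
        for tok in proposal:
            expected = target(ctx)
            if expected != tok:
                ctx.append(expected)  # first mismatch: keep the target's token
                break
            ctx.append(tok)
        else:
            ctx.append(target(ctx))  # all accepted: the pass yields a bonus token
        out = ctx
    return out[:goal], passes


# Hypothetical toy models: the draft agrees with the target most of the time.
def target_model(ctx: List[Token]) -> Token:
    return (ctx[-1] * 3 + 1) % 11


def draft_model(ctx: List[Token]) -> Token:
    guess = target_model(ctx)
    return guess if len(ctx) % 5 else (guess + 1) % 11


tokens, passes = speculative_decode(target_model, draft_model, [1], max_new=20)
print(tokens == greedy_decode(target_model, [1], 20), passes)
```

The output always matches what the target would have produced on its own; the draft only changes how many target passes are needed. The gain depends on the draft being much cheaper than the target and agreeing with it often, so a draft that is not cheap relative to the target can make things slower overall.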

omneity 453 days ago

It's okay for extracting simple things like addresses, or for formatting text with some input data, like a more advanced form of mail merge.

I haven't evaluated it on these tasks, so YMMV. I'm exploring other possibilities as well. I suspect it might be decent at autocomplete, and it's small enough that one could consider fine-tuning it on a codebase.
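
A rough sketch of what the address-extraction use might look like in Python, under stated assumptions: `generate` is a hypothetical callable wrapping whatever runtime serves the 0.6B model, and the prompt wording and the three output keys are made up for illustration, not taken from any tested setup.

```python
import json
from typing import Callable, Dict

KEYS = ("street", "city", "postal_code")  # illustrative schema, an assumption

PROMPT = (
    "Extract the postal address from the text below. "
    'Reply with JSON only, using the keys "street", "city", "postal_code".\n\n'
    "Text: {text}\nJSON:"
)


def extract_address(text: str, generate: Callable[[str], str]) -> Dict[str, str]:
    """Ask the model for a JSON address and validate what comes back."""
    raw = generate(PROMPT.format(text=text))

    # Small models often wrap the JSON in extra prose, so keep only the
    # span from the first "{" to the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in model output")
    data = json.loads(raw[start : end + 1])

    # Never trust the model to return every field.
    missing = [key for key in KEYS if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {key: str(data[key]) for key in KEYS}


# Stub standing in for a real model call, so the sketch runs on its own.
def fake_generate(prompt: str) -> str:
    return 'Sure! {"street": "1 Main St", "city": "Springfield", "postal_code": "12345"}'


print(extract_address("Ship to 1 Main St, Springfield 12345.", fake_generate))
```

Validating the parsed output matters more with a model this small, since a missing or malformed field is a likely failure mode.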