| HN Mirror


	by jszymborski 338 days ago
	I wonder how well suited some of the smaller LLMs like Qwen 0.6B would be to this... it doesn't sound like a super complicated task. I also feel like you could train a model on this task by using the zero-shot outputs of larger models to create a dataset, making something very zippy.
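
A minimal sketch of the dataset-building idea above, under stated assumptions: the thread does not pin down the task, so `toy_teacher` is a hypothetical stand-in for a zero-shot call to a larger model, and `build_distillation_dataset` and `to_jsonl` are illustrative names rather than any library's API. It only shows the shape of the pipeline: label raw inputs with the teacher, then save the pairs for fine-tuning a small model.

```python
import json
from typing import Callable, Iterable


def build_distillation_dataset(
    inputs: Iterable[str], teacher: Callable[[str], str]
) -> list[dict]:
    """Pair each raw input with the teacher's zero-shot output."""
    return [{"input": text, "label": teacher(text)} for text in inputs]


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


def toy_teacher(text: str) -> str:
    # Hypothetical placeholder: a real teacher would prompt a larger LLM here.
    return "long" if len(text) > 20 else "short"


if __name__ == "__main__":
    samples = ["hello", "a considerably longer piece of text"]
    print(to_jsonl(build_distillation_dataset(samples, toy_teacher)))
```

The resulting JSONL could then serve as supervised fine-tuning data for the small model; how well that transfers would depend on the actual task and on the quality of the teacher's labels.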


accrual 338 days ago

I wondered something similar. Perhaps a local model held in the VRAM of a 16GB or 24GB graphics card would perform well too. It would have to be a quantized/distilled model, but maybe sufficient, especially with some additional training as you mentioned.


jszymborski 338 days ago

If Qwen 0.6B is suitable, then it could fit in 576MB of VRAM[0].

[0] https://huggingface.co/unsloth/Qwen3-0.6B-unsloth-bnb-4bit
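
As a rough sanity check on that figure, here is a back-of-the-envelope sketch of weight memory at different precisions. It assumes a nominal 0.6e9 parameters (taken from the "0.6B" in the name, not from a measured count) and counts only the raw weights, so it gives a lower bound. Activations, KV cache, quantization metadata, and any layers kept at higher precision are not modeled, which is why it will not reproduce the 576MB number exactly.

```python
def weight_memory_mb(n_params: float, bits_per_weight: float) -> float:
    """Lower bound on memory for the weights alone, in decimal megabytes."""
    return n_params * bits_per_weight / 8 / 1e6


# Assumption: nominal parameter count implied by the "0.6B" model name.
N_PARAMS = 0.6e9

if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_mb(N_PARAMS, bits):.0f} MB")
```

Under these assumptions the 4-bit weights alone come to about 300 MB, so a total in the hundreds of megabytes is the right order of magnitude.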


numpad0 337 days ago

Or on a single Axera AX630C module: https://www.youtube.com/watch?v=cMF6OfktIGg&t=25s


otabdeveloper4 338 days ago

16GB is way overkill for this.
