| HN Mirror



	by accrual 341 days ago
	I wondered something similar. Perhaps a local model cached on a 16GB or 24GB graphics card would perform well too. It would have to be a quantized/distilled model, but that may be sufficient, especially with some additional training as you mentioned.

2 comments

If Qwen 0.6B is suitable, then it could fit in 576MB of VRAM[0].

16GB is way overkill for this.
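
For a rough sanity check on numbers like this, here is a back-of-the-envelope sketch of weight memory at a few precisions. It assumes a nominal 600 million parameters (the "0.6B" label, not an exact count) and counts weights only, ignoring KV cache, activations, and runtime overhead. The function name and the list of precisions are just illustrative.

```python
def weight_memory_bytes(n_params: int, bits_per_param: float) -> float:
    """Approximate bytes needed to hold the weights alone."""
    return n_params * bits_per_param / 8


# Assumption: "0.6B" taken as a nominal 600 million parameters.
NOMINAL_PARAMS = 600_000_000

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    mb = weight_memory_bytes(NOMINAL_PARAMS, bits) / 1e6
    print(f"{label}: ~{mb:.0f} MB for weights")
```

Even at 16-bit the weights come to roughly 1.2 GB, so a model of this size leaves most of a 16GB card unused.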