| HN Mirror

	by arkmm 462 days ago
	There are some options out there, depending on what type of task you're trying to fine-tune for. For example, I think RL fine-tuning for DeepSeek isn't well developed yet, but you can fine-tune a small Llama model (~3B params) for classification or extraction tasks and it works really well. What sort of tasks were you looking at fine-tuning for?

1 comment

Code generation or question answering, but ideally with a 70B+ model.