Hacker News
by arkmm 462 days ago
There are some options out there, depending on what type of task you're trying to fine-tune for. I think RL fine-tuning for DeepSeek, e.g., isn't well developed yet, but you can fine-tune a small Llama model (~3B params) for classification or extraction tasks and it works really well. What sort of tasks were you looking at fine-tuning for?

Code generation or question answering. But ideally with a 70B+ model.