Hacker News
by hedgehog 111 days ago
There are a bunch of tutorials on how to use GRPO to fine-tune a small Qwen model. Depending on what you're doing, LoRA or even just prefix tuning can give pretty good results with no special hardware.
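To give a rough idea of why LoRA fits on ordinary hardware, here's a minimal pure-Python sketch of the idea: the original weight matrix W stays frozen and only two small low-rank matrices A and B are trained. The layer size, rank, and function names below are made up for illustration and don't come from any particular tutorial or library.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora) trainable parameter counts for one linear layer."""
    full = d_in * d_out            # full fine-tuning: every weight in W is trainable
    lora = rank * (d_in + d_out)   # LoRA: only A (rank x d_in) and B (d_out x rank)
    return full, lora


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x), with W kept frozen."""
    base = matvec(W, x)                # output of the frozen pretrained layer
    delta = matvec(B, matvec(A, x))    # low-rank update, the only trained part
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(d_in=1024, d_out=1024, rank=8)
print(full, lora)  # 1048576 16384
```

For that made-up 1024 x 1024 layer at rank 8, the trainable parameters drop by a factor of 64, which is most of the reason the optimizer state and gradients fit in modest memory. In practice you'd use a library rather than hand-rolling this.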