| HN Mirror



	by andy_ppp 32 days ago
	Fine-tuning these models (at least with PPO or an equivalent) requires even more VRAM than inference does, potentially 2-3 times as much.

1 comments

You could use PEFT? Operating on only a subset of weights is fairly standard practice nowadays …
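
To put a rough number on "only a subset of weights", here is a minimal sketch of the parameter count for a LoRA-style adapter on one linear layer. It assumes the usual low-rank setup, where the frozen weight matrix (d_out x d_in) gets a trainable update made of two small factors of rank r. The layer size and rank below are hypothetical, picked only for illustration, and the function names are not from any library.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Weights updated when the whole (d_out x d_in) matrix is fine-tuned."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Weights in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank (assumptions, not taken from the thread).
d_out, d_in, rank = 4096, 4096, 8

full = full_params(d_out, d_in)
lora = lora_trainable_params(d_out, d_in, rank)
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  trainable fraction: {fraction:.4%}")
```

Under these assumptions the adapter trains well under 1% of the layer's weights, so the gradients and optimizer state for that layer shrink accordingly. It says nothing about activation memory, or about whether the tuned model stays as general as the original.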

Yes, I used LoRA and it’s fine, but I’m not convinced the model doesn’t end up more stupid and less general.