Hacker News
by
andy_ppp
32 days ago
Fine-tuning these models (at least with PPO or an equivalent) requires even more VRAM than inference does, potentially 2-3 times as much.
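For a rough sense of where a multiplier like that could come from, here is a back-of-envelope sketch. The byte counts are assumptions (fp16 weights and gradients, an optional optimizer buffer), and the function name is hypothetical. It ignores activations and any extra models a PPO setup keeps in memory, so real numbers may differ a lot.

```python
def training_to_inference_ratio(bytes_weights=2, bytes_grads=2, bytes_optimizer=0):
    """Per-parameter training memory divided by per-parameter inference memory.

    Assumption: inference only holds the weights, while training also holds
    gradients and optimizer state. Activations are deliberately left out.
    """
    training = bytes_weights + bytes_grads + bytes_optimizer
    return training / bytes_weights


# Plain SGD: weights + gradients, no optimizer state.
sgd_ratio = training_to_inference_ratio()

# SGD with momentum: one extra fp16 buffer per parameter (assumed precision).
momentum_ratio = training_to_inference_ratio(bytes_optimizer=2)

# Adam with two fp32 buffers per parameter (assumed precision).
adam_ratio = training_to_inference_ratio(bytes_optimizer=8)

print(sgd_ratio, momentum_ratio, adam_ratio)
```

Under these assumptions the lighter optimizers land in the 2-3x range, while Adam with fp32 state comes out higher.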
rusk
31 days ago
You could use PEFT? Operating on only a subset of weights is fairly standard practice nowadays …
andy_ppp
31 days ago
Yes, I used LoRA and it’s fine, but I’m not convinced the model doesn’t end up more stupid and less general.
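To make the LoRA trade-off concrete, here is a minimal sketch of the trainable-parameter arithmetic for a single weight matrix. It assumes the usual formulation in which the original matrix W (d_out x d_in) is frozen and only two low-rank factors, B (d_out x rank) and A (rank x d_in), are trained. The layer size and rank are illustrative values, not taken from any particular model.

```python
def full_params(d_in, d_out):
    # Parameters trained when the whole weight matrix W is updated.
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    # Parameters trained by LoRA: B has d_out * rank entries, A has rank * d_in.
    return rank * (d_in + d_out)


# Illustrative (assumed) sizes: a 4096 x 4096 projection with rank 8.
d_in, d_out, rank = 4096, 4096, 8

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full}, LoRA: {lora}, fraction trained: {fraction:.4%}")
```

The saving comes from restricting the weight update to rank 8. The arithmetic says nothing on its own about whether that restriction costs generality.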