by kkielhofner
1150 days ago
Looking at what they're doing here, probably not as much as you think. As you note, with the plethora of open/open-ish LLMs available today and LoRA + PEFT, you can fine-tune with low VRAM and pretty quickly, so even a single A100 (or whatever cloud GPU you can get) is just fine. I've even seen people pull it off in reasonable time on super cheap T4s, A10s, etc. I doubt anyone reading a blog post is attempting to train a "true" multi-billion-param LLM from scratch.
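
For a rough sense of why LoRA keeps the footprint small, here is a back-of-the-envelope sketch. The model shape (32 layers, hidden size 4096), the choice to adapt two square projection matrices per layer, and rank 8 are all assumptions picked for illustration, not anything from the post. It only counts trainable parameters; actual VRAM use also depends on the frozen base weights, activations, and optimizer state.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA learns a low-rank update B @ A, with B of shape (d_out, rank)
    # and A of shape (rank, d_in), instead of a full (d_out, d_in) update.
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    # Parameter count of the full weight matrix being adapted.
    return d_in * d_out


# Hypothetical model shape, chosen only for illustration.
hidden = 4096          # assumed hidden size
layers = 32            # assumed number of transformer layers
adapted_per_layer = 2  # assumed: two square projections adapted per layer
rank = 8               # assumed LoRA rank

lora_total = layers * adapted_per_layer * lora_trainable_params(hidden, hidden, rank)
full_total = layers * adapted_per_layer * full_params(hidden, hidden)

print(f"LoRA trainable params:   {lora_total:,}")
print(f"Full params in matrices: {full_total:,}")
print(f"Fraction trained:        {lora_total / full_total:.4%}")
```

Under those assumptions the adapters come to about 4.2M trainable parameters versus roughly 1.07B in the matrices they sit on, which is why gradients and optimizer state stay cheap enough for a single modest GPU.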