by thangngoc89 801 days ago
Training on the MPS backend is suboptimal and really slow.

Do people do training on systems this small, or just inference? I could see maybe doing a little bit of fine-tuning, but certainly not from-scratch training.
If you mean training llama from scratch, you aren't going to do that on any single box.

But even with a single 3090 you can do quite a lot with LLMs (through QLoRA and similar).
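For a rough sense of why the QLoRA route fits on one card, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 7B model size, the 40M-parameter adapter, and the common rule of thumb of about 16 bytes per trainable parameter for mixed-precision Adam. Activations and quantization overhead are ignored, so treat the numbers as a lower bound, not a measurement.

```python
# Back-of-the-envelope VRAM estimate; all figures are rough assumptions.
GB = 1e9

def full_finetune_gb(n_params):
    # Mixed-precision Adam rule of thumb: ~16 bytes per parameter
    # (fp16 weights + fp16 grads + fp32 master weights + two fp32
    # Adam moments). Activations are not counted.
    return n_params * 16 / GB

def qlora_gb(n_params, adapter_params):
    # Frozen base weights stored in 4 bits (0.5 bytes per parameter),
    # plus a small trainable adapter at ~16 bytes per parameter.
    # Activations and quantization constants are not counted.
    return (n_params * 0.5 + adapter_params * 16) / GB

n_params = 7e9         # hypothetical 7B-parameter model
adapter_params = 40e6  # hypothetical LoRA adapter size
vram_3090_gb = 24      # memory on a single RTX 3090

print(f"full fine-tune: ~{full_finetune_gb(n_params):.0f} GB")
print(f"QLoRA:          ~{qlora_gb(n_params, adapter_params):.1f} GB")
print(f"3090 budget:     {vram_3090_gb} GB")
```

Under these assumptions the full fine-tune lands far above 24 GB while the 4-bit base plus adapter stays well under it, which is the gap QLoRA-style methods exploit.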

Yep. The price/performance of a multi-4090 system is way better than that of the professional cards (Axxx). Also, deep learning outside of LLMs has many different uses.