| HN Mirror


	by thangngoc89 801 days ago
	Training on the MPS backend is suboptimal and really slow.

wtallis 801 days ago

Do people do training on systems this small, or just inference? I could see maybe doing a little bit of fine-tuning, but certainly not from-scratch training.

redox99 801 days ago

If you mean train llama from scratch, you aren't going to train it on any single box.

But even with a single 3090 you can do quite a lot with LLMs (through QLoRA and similar).
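
To put rough numbers on that, here is a back-of-envelope sketch of weight memory only. The 7-billion-parameter model size and the helper name `weight_memory_gib` are illustrative assumptions, not something from this thread; it also ignores activations, adapter weights, optimizer state, and quantization overhead, so real usage will be higher.

```python
# Back-of-envelope estimate of the memory needed just to hold model weights.
# Assumptions (illustrative, not from the thread): a 7e9-parameter model,
# 16 bits per parameter for fp16 and 4 bits per parameter for 4-bit
# quantization. Activations, LoRA adapters, optimizer state, and
# quantization metadata are deliberately ignored.

GIB = 1024 ** 3  # bytes per GiB


def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return num_params * bits_per_param / 8 / GIB


if __name__ == "__main__":
    params = 7e9  # hypothetical model size
    for label, bits in [("fp16", 16), ("4-bit", 4)]:
        print(f"{label}: {weight_memory_gib(params, bits):.1f} GiB")
```

Under those assumptions, the 4-bit weights take a small fraction of a 3090's 24 GB, which is roughly why QLoRA-style fine-tuning can fit on one card.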

thangngoc89 800 days ago

Yep. The price/performance of a multi-4090 system is way better than that of the professional cards (Axxx). Also, deep learning outside of LLMs has many different uses.
