


	by blackeyeblitzar 783 days ago
	I can understand the inference part being useful and practical for Apple devs. I’m just wondering about the training part, for which Apple silicon devices don’t seem very useful.

2 comments

spmurrayzzz 782 days ago

My M2 Max significantly outperforms my 3090 Ti for training a Mistral-7B LoRA. It's sort of a case-by-case situation though, as it depends on how optimized the CUDA kernels happen to be for whatever workload you're doing. For inference, for example, there's a big delta between standard transformers and exllamav2: Apple silicon may outperform the former, but certainly not the latter.


rgbrgb 783 days ago

I’ve seen several people fine-tune Mistral 7B on MacBooks.
