Hacker News new | ask | show | jobs
by MBCook 784 days ago
There are an insane number of Apple Silicon devices out there.

If your product runs on an iPhone or iPad, I’m sure this is great.

If you only ever want to run on 4090s or other server stuff, yeah this probably isn’t that interesting.

Maybe it’s a good design for the tools or something, I have no experience to know. Maybe someone else can build off it.

But it makes sense Apple is releasing tools to make stuff that works better on Apple platforms.

1 comments

I can understand the inference part being useful and practical for Apple devs. I’m just wondering about the training part, for which there Apple silicon devices don’t seem very useful.
My M2 Max significantly outperforms my 3090 Ti for training a Mistral-7B LoRA. Its sort of a case-by-case situation though, as it depends on how optimized the CUDA kernels happen to be for whatever workload you're doing (i.e. for inference, theres a big delta between standard transformers vs exllamav2, apple silicon may outperform the former, but certainly not the latter).
I’ve seen several people fine tune mistral 7B on MacBooks.