Hacker News
by codercotton 2027 days ago
I thought the M1 GPU alone was about as powerful as a 1080. Wouldn't the Neural Engine cores make it faster than a 1080 for ML?
7 comments

Is it possible you meant the GTX 1650/1050 Ti mobile GPU, as mentioned in the Tom's Hardware article: https://www.tomshardware.com/news/apple-silicon-m1-graphics-...

They are in a comparable TFLOPS range.

The 1080 mobile is way faster than the 1060 (60-90% faster [1]) and definitely ahead of the M1 by a large factor.

[1]: https://gpu.userbenchmark.com/Compare/Nvidia-GTX-1080-Mobile...

The M1 GPU does 2.6 TFLOPS (FP32). The 1080 Ti does 11.3 TFLOPS.

The M1 GPU isn't really that powerful; it's comparable to an Nvidia 760 (from 2013). The M1's Neural Engine does have more kick of course, but the GPU otherwise is nothing superb (other than marketing).

While I agree in general, I do want to point out that this is still a lightweight entry-level laptop SoC being compared to the desktop GPU you've mentioned. It can also run fanless, as in the M1 MBA.

A more modern comparison with a mobile GPU would be the GTX 1050 Ti mobile, which is around 10-20% faster than the 760: https://gpu.userbenchmark.com/Compare/Nvidia-GTX-760-vs-Nvid...

Even so, the 1050 Ti draws around 75 W (2016) and the 760 has a TDP of 170 W (2013), while the M1 GPU draws much less than 10 W (at full load, the entire SoC, not just the GPU, peaks at 16 W in the Mac mini).
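
For what it's worth, here is a rough sketch of the power gap using only the wattage figures quoted above. The 10 W value is the stated upper bound for the M1 GPU, so the real ratios are probably higher. None of these are measurements, and the names are just for illustration.

```python
# Wattage figures as quoted in this comment; rough numbers, not measurements.
M1_GPU_WATTS = 10.0        # upper bound ("much less than 10 W")
GTX_1050_TI_WATTS = 75.0   # quoted power draw
GTX_760_WATTS = 170.0      # quoted TDP


def power_ratio(other_watts, m1_watts=M1_GPU_WATTS):
    """Return how many times more power the other GPU draws than the M1 GPU."""
    return other_watts / m1_watts


ratio_1050_ti = power_ratio(GTX_1050_TI_WATTS)
ratio_760 = power_ratio(GTX_760_WATTS)

print(f"1050 Ti draws at least {ratio_1050_ti:.1f}x the power of the M1 GPU")
print(f"760 draws at least {ratio_760:.1f}x the power of the M1 GPU")
```

If the 760 and the M1 GPU really are in the same performance class, as claimed upthread, that power ratio is also a rough lower bound on the perf/watt gap.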

It would be interesting to see what Apple does when it scales this up to 75 W or more for its own custom desktop GPUs, which are rumored to be in development. However, a separate desktop GPU does lose the benefits of UMA that make the M1 fast.

But they built their GPU on world-leading TSMC 5nm. I wonder whether the M1 GPU's perf/watt mostly comes from the process advantage. I'd like to see a comparison with the Kirin 9000. (maybe published by AnandTech)
I mean, my entry deep learning machine was a refurbished Linux box with a pile of gpus I found in the electronics recycling bin at work. Later upgraded to a 1080 and then a 2080, as one does.

The nice thing about the desktop is that it can just train for days and I don't need to worry about using it for other things, losing time moving locations, etc. It's also still probably cheaper than the M1 laptop, even with a nicer GPU than what you can find in the trash.

I did the same (well, used a discarded GPU that my kids didn't need any more) and I have one caution: If you're using PyTorch, you'll want to have a CPU that at least supports AVX. The C++ libraries that ship with PyTorch assume AVX at compile time and don't have an option to disable it at runtime. It's a PITA to recompile the entire stack.
Not sure what you are talking about. PyTorch binaries don’t assume anything about your CPU.
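
Whether the PyTorch binaries actually need AVX is disputed just above, but checking whether a CPU reports AVX is easy either way. A minimal sketch, assuming a Linux box where `/proc/cpuinfo` lists feature flags on a `flags` line; the helper names `parse_cpu_flags` and `has_avx` are made up for illustration.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Assumes the x86 Linux layout, where a line starts with 'flags'.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the 'avx' flag is listed in the given cpuinfo text."""
    return "avx" in parse_cpu_flags(cpuinfo_text)


cpuinfo_path = Path("/proc/cpuinfo")
if cpuinfo_path.exists():
    print("AVX reported:", has_avx(cpuinfo_path.read_text()))
else:
    print("No /proc/cpuinfo here; this sketch only covers Linux.")
```

On an old recycled box, a `False` here is the case the caution above is about.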
When it comes to Apple's mobile GPUs (and now M1) their main advantage is that they have a custom architecture and compiler and they're both the result of lots of experts working hard on both and leveraging existing work (in the architecture case, the fact that they started from PowerVR's world-class design, and in the compiler case, the fact that their shader compiler is based on LLVM). They're really well-designed chips. Unfortunately there's no particular reason that they would be better at ML than any other GPU, since they had no reason to consider it in the design process until somewhat recently.
The M1 GPU is able to do 2.5 TFLOPS, while the 1080 is said to be able to do 8 TFLOPS.

From Geekbench it also looks like the M1 GPU is about 1/4-1/3 as powerful as a 1080.
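
As a quick sanity check, the two quoted peak numbers land in the same range as the Geekbench estimate. This is just arithmetic on the figures from this comment (2.5 and 8 TFLOPS), not a benchmark.

```python
# Peak FP32 figures as quoted above; not measured values.
M1_TFLOPS = 2.5
GTX_1080_TFLOPS = 8.0

ratio = M1_TFLOPS / GTX_1080_TFLOPS
in_geekbench_range = 1 / 4 <= ratio <= 1 / 3

print(f"M1 / 1080 peak ratio: {ratio:.4f}")
print("Within the 1/4 to 1/3 range:", in_geekbench_range)
```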

The M1 may benefit from faster RAM and shared memory, though.

Apple states that the Neural Engine is able to do about 11 trillion operations per second (but oddly enough, they don’t report TFLOPS).

Also, besides memory speed, which certainly will be an important factor, it might depend on the model architecture, fp32/fp16, etc. Much like you cannot say CPU 1 is faster than CPU 2 (it totally depends on the type of workload), the same goes for deep learning benchmarks.
Performance in games typically doesn't have much correspondence with how good a GPU will be at compute. Rendering games, videos, etc is largely constrained by things like memory bandwidth, culling, image decoding and blending - all stuff typically done with special-purpose hardware that won't be very useful for compute or neural nets. Sometimes games or media have complex shaders that are entirely limited by compute, but it's not terribly common.

Apple's dedicated ML hardware is probably quite good, but we don't have any way to know how good without doing math on die size + power draw and running benchmarks.

1080? The 1080 is a high-end GPU that cost $700.
The price doesn't mean much, the 1080 is pretty old at this point and the M1 has considerably smaller transistors. Considering how expensive Apple products are it's quite possible the cost of an M1 chip isn't much lower than that of the core shipped inside the 1080. A lot of what you pay for on a 1080 is cooling, display output, and power delivery.
Indeed, some of the cost of a 1080 at this point may be that the demand is constrained to people that want replacement parts for uniform deployments...
You may be thinking of the benchmarks that ranked it alongside a 1050 Ti
AFAIK the new TF alpha for M1 Macs does not make use of the neural cores.