by cdavid
305 days ago
I was surprised by the previous comparison on the omarchy website, because Apple M* chips work really well for data science work that doesn't require a GPU. It may be explained by integer vs. float performance, though I am too lazy to investigate. A weak data point, using the product of an N=6000 matrix with itself in numpy:
- SER 8 8745, Linux: 280 ms -> 1.53 Tflops (single prec)
- my M2 MacBook Air: ~180 ms -> ~2.4 Tflops (single prec)
This is 2 mins of benchmarking on the computers I have. It is not an apples-to-apples comparison (e.g. I use the numpy default BLAS on each platform), but it is not completely irrelevant to what people will do w/o much effort. And floating point is what matters for LLMs, not integer computation (which is what the Ruby test suite is most likely bottlenecked by).
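For anyone who wants to repeat this kind of quick check, here is a rough sketch of how such a measurement could be done. It is not the exact script behind the numbers above: the helper names (`matmul_flops`, `tflops`, `bench_matmul`), the warm-up run, and taking the best of three repeats are all assumptions. It counts 2*N^3 floating-point operations for an N x N product and uses whatever BLAS your numpy build links against.

```python
import time

import numpy as np


def matmul_flops(n: int) -> float:
    """Operations in an n x n by n x n product: n^3 multiplies plus about n^3 adds."""
    return 2.0 * n**3


def tflops(n: int, seconds: float) -> float:
    """Convert a wall-clock time for one n x n product into Tflops."""
    return matmul_flops(n) / seconds / 1e12


def bench_matmul(n: int = 6000, repeats: int = 3, dtype=np.float32) -> float:
    """Return the best wall-clock time (seconds) over `repeats` runs of a @ a."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n), dtype=dtype)  # single precision by default
    a @ a  # warm-up run, not timed (hypothetical choice, not from the comment)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ a
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    n = 6000
    secs = bench_matmul(n)
    print(f"N={n}: {secs * 1e3:.0f} ms -> {tflops(n, secs):.2f} Tflops (single prec)")
```

Calling `np.show_config()` reports which BLAS the numpy build uses, which is worth noting next to any number you post, since that is the main thing that differs between platforms here.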
|
Apple M chips are slower at computation than AMD chips, but they have soldered, on-package fast RAM with a wide memory interface, which is very useful on workloads that handle lots of data.
Strix Halo has a 256-bit LPDDR5X interface, twice as wide as that of the typical desktop chip, roughly equal to that of the M4 Pro and half that of the M4 Max.
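To make the width comparison concrete, here is a small sketch of the usual peak-bandwidth arithmetic: bytes per transfer (bus width / 8) times transfers per second. The function name is made up for illustration, and the 8000 MT/s data rate is an assumed placeholder applied to every entry, not a figure from this thread. Real parts run at different data rates, so only the ratios between widths are meaningful here.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak memory bandwidth in GB/s for a given bus width and data rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


if __name__ == "__main__":
    ASSUMED_MT_S = 8000  # placeholder data rate (assumption), same for every row
    for label, width in [
        ("128-bit (typical desktop)", 128),
        ("256-bit (Strix Halo, M4 Pro)", 256),
        ("512-bit (M4 Max)", 512),
    ]:
        print(f"{label}: {peak_bandwidth_gb_s(width, ASSUMED_MT_S):.0f} GB/s peak")
```

At a fixed data rate the peak scales linearly with width, which is the whole point of the comparison above: twice the width, twice the theoretical bandwidth.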