The exact same binary saw crushing victories in different directions, though, depending on what it was testing. Look at the GNU Radio results, for example.
It's not the exact same binary - that's the point. It is different instruction sets, with potentially different optimisations. They're even compiled with different compilers - what looks like a GCC 12.1.1 snapshot for x86, and GCC 12.1.0 for ARM64.
It might be the same C. It might have hand-coded assembly for the important bits on x86 but not on ARM, or vice versa. It might just be that one specific algorithm executes particularly well on one CPU rather than the other. It might be that the slightly different version of GCC added a new optimisation.
edit: they're also comparing an actively cooled laptop to a passive one - so you would expect M2 to throttle in longer benchmarks, for extra distortion.
The methodology is flawed. It lets you cherry-pick some individual results, and if your particular use case is in there, great. But you don't know what state the M2 was in when a test started (e.g. whether it was already hot and throttling).
It's basically impossible to draw any useful generalised conclusions from these benchmarks.
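To make the throttling concern concrete, here is a minimal sketch of how one could at least check whether a machine slows down over repeated runs of the same workload. It is not what the review did. `time_runs`, `drift_ratio`, and `busy_work` are hypothetical names made up for illustration, and a drift well above 1.0 only hints at throttling. It cannot tell you whether the machine was already hot before the first run.

```python
import statistics
import time


def time_runs(workload, runs=10):
    """Return wall-clock seconds for each of `runs` back-to-back calls."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


def drift_ratio(timings):
    """Median of the last half of the timings divided by median of the first half.

    Values well above 1.0 mean later runs were slower than earlier ones,
    which is one possible sign of thermal throttling (an assumption, not proof).
    """
    half = len(timings) // 2
    if half == 0:
        raise ValueError("need at least two timings")
    return statistics.median(timings[-half:]) / statistics.median(timings[:half])


def busy_work(n=200_000):
    # Hypothetical CPU-bound stand-in for a real benchmark.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    timings = time_runs(busy_work, runs=10)
    print(f"drift ratio: {drift_ratio(timings):.2f}")
```

Running each test after a cool-down period and reporting this kind of ratio alongside the score would at least show when a result was taken on a machine that was slowing down.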
Picking individual use cases is exactly how you should be reviewing benchmarks. You should be picking a laptop based on what you do. If you don't do certain things, why would you care that some other CPU is faster at what you are not doing?
If you're looking for a "general" comparison, there is none. General usage of a laptop computer for whom? What do you consider general usage? What exactly are you looking for? For 99% of what people do, they won't even be able to tell a Celeron apart from an M2.
Why do you think we have things like discrete GPUs? You buy certain hardware for certain tasks.
Or do you just want to say that your CPU is better than someone else's? Who gives a shit? That's really all you get from a "general" performance review, a bunch of vague crap.
This review was great. It shows what Linux users can expect under certain workloads on two laptops that cost the same amount of money. You can then decide which one is best for you as a Linux user who may be interested in an M2.
Drawing general comparisons is exactly what this article tries to do at the end, that's one of my concerns with it.
They don't know what state the M2 was in at the start of each test (because the hw monitoring support isn't there yet), and this is a system that is known to thermally throttle, so the individual results are potentially flawed too.
The laptops don't cost the same amount. This is comparing an $1100 laptop to an $1800 laptop.
FWIW, I don't actually care which is faster (I own both AMD and Apple hardware that I use for different things) - I just think the review is flawed.
> It's not the exact same binary - that's the point
I think you misunderstood. "GNU Radio" on M2 was sometimes screamingly fast, and sometimes embarrassingly slow, depending on which test it was.
I'm not saying that "GNU Radio" was the same on M2 and AMD; obviously it's not. I'm saying the performance results for GNU Radio specifically were insanely inconsistent - M2 won by a stupidly huge amount in one of the GNU Radio results, and AMD by an equally absurd margin in the other GNU Radio results.
Same "GNU Radio" compiled binary on the respective platforms, huge swings in performance depending on what that binary was doing.
There were a couple of other similar examples, where performance for the same program swung wildly depending on the exact task.
That's not necessarily unexpected and still follows most of what I've said - if one particular algorithm is hand-optimised and one isn't, you will see wild swings - especially if one has been hand-vectorised and one is failing to auto-vectorise during compilation.