Hacker News
by 1_player 1546 days ago
Intel has been sitting twiddling its thumbs for years, so it was easy for Apple to actually innovate and create an incredible processor. Controlling the entire stack and being able to switch from x86 also helps.

But on the GPU side, NVIDIA and AMD have been battling it out for a while and competition is tight, so thinking Apple could come in, compete, and actually beat the fastest GPU currently available to consumers was a bit of a pipe dream, in hindsight. The GPU market hasn't been resting on its laurels the way Intel has. That said, I am still positively surprised at how close they managed to get to the 3090, but alas, no cigar.

6 comments

Besides "being able to switch from x86", there are some other factors that make this feat look easier than it actually is:

- building on the ARM architecture, which (1) saved them a lot of design costs and (2) being a simpler architecture already has a "built-in" performance advantage over x86.

- having years of experience already with the "A-series" of chips used in iPhones and iPads since 2010. Everyone is talking about "Apple silicon" now, but these predecessors are often forgotten.

- having a privileged partnership with TSMC, where they have access to the latest and best processes and priority over all other TSMC customers.

You're forgetting the most important one:

- This is not their first rodeo. They've migrated from PowerPC to x86 and developed the first Rosetta, the Universal Binary format, and related machinery.

They've tapped into that experience, and it went way way smoother.

Just within Mac they've gone from 680x0 to PowerPC (related to POWER but separate), to AMD64, to Apple Silicon. Before the Mac, they also had the 6502 and 6800 platforms.
They went with 32-bit x86 before adding support for amd64, and then eliminated 32-bit support right before going to Apple silicon (presumably motivated in part by Rosetta 2 being optimized for translating amd64 software).
You're right. I'd forgotten about that. I haven't forgotten about the weird 64-bit-CPU-with-32-bit-EFI snafu, though. That was a pain.
However, as far as I can remember, they didn't have to make a transition plan like they did for the Intel transition.

They had to research how to make it completely seamless that time, for the first time.

That was like 20 years ago. I wonder how many of those folks are still around. It would be interesting to understand how they ran this project internally. Did it build off of that previous work, or did they approach it from the ground up?
It's a small nitpick but I think technically they moved from PowerPC, not POWER.
Yeah, you’re right, sorry. Fixed it. :)
The mere fact that Apple's claims to challenge NVidia's best weren't dismissed with a laughable graph, and instead were concluded to be "well, a bit of a marketing stretch, but not a bad attempt Apple"...

When Intel has been failing at discrete graphics competition with ATI/NVidia for... two decades? Yet Apple cranks out a functional competitor a year after their first desktop chip. Yes they probably have lots of experience with integrated graphics on SOC for the iPhones so this isn't totally out of the blue. But still.

What has Intel been doing all these years? Is it THAT dysfunctional? Add to the failures of VLIW/Itanium, loss of the SSD crown, underwhelming XPoint, no ARM competitors, no real position in mobile, twice getting passed up by AMD in the performance race, a company with 1/10th, possibly 1/100th the resources.

Intel did make a GPU recently, I think? I thought they had at least improved from "Just don't play games on this" to "Plays if it isn't power hungry". Considering they haven't really done anything about gaming in the past 10(?) years, it is a decent improvement.
Do you realize that Apple silicon is the result of decades of chip design experience? They didn't just decide to do it Q3 next year, one day.

Intel rested on their laurels because AMD almost wiped themselves out, which led to them getting complacent with the process. The actual designs were quite good; it took AMD a few years of better process and newer designs to properly dethrone Intel (a lead Intel has now reclaimed in the mid range; this will now yo-yo, as it always should).

> Intel rested on their laurels because AMD almost wiped themselves out

You mean, Intel almost illegally wiped AMD out using its monopoly position in the x86 market:

https://www.amd.com/en/corporate/antitrust-ruling

Bulldozer was terrible
The meltdown/spectre mitigations have wiped out most of the gap between bulldozer chips and the contemporary Intel ones.

Did Bulldozer really lose or was the competition cheating?

> Did Bulldozer really lose or was the competition cheating?

Bulldozer and its competition were both designed at a time when leaking information to another process on the same physical core was outside the threat profile that CPU memory protection was designed to mitigate. Intel was optimizing aggressively within the envelope of expectations at the time, which were upended by the rise of cloud computing.

Big citation needed there.

Spectre mitigations shouldn't affect quite a lot of programs because there's nothing to mitigate within the same address space.

https://www.phoronix.com/scan.php?page=article&item=spectre-...

Worst case I see is approximately 33% performance penalty in aggregate for Xeon parts, with much worse performance in specific scenarios. Comparing the original Bulldozer benchmarks, this does close the gap like GP suggested.

Also, refresh your memory on how the mitigations work because there’s definitely an impact for most programs.

Hard to innovate properly when your income stream is artificially restricted.
Then how did Ryzen work?
Semi-custom (aka console) chips kept them afloat, and they executed well on Ryzen while Intel made modest year-over-year improvements. Doesn't mean they weren't working on a relatively tight R&D budget, though.
Ryzen worked in part because it didn't try to be super-clever. Bulldozer was very opinionated about how computing would look (cheap cpu + big coprocessor) whereas Zen is much more practical
A large part, likely the majority, of Intel's CPU dominance was the result of their "unfair" advantage in fabs, not in hardware design. An often-quoted rule of thumb is 90%/10%: 90% of the progress in semiconductors is attributable to process shrinkage, 10% to improved hardware design. TSMC contributed the lion's share in making AMD competitive again. Design-wise, chips from Intel, AMD and Apple are all incredibly impressive. Afaik, Apple pays top dollar to get access to the highest-end TSMC process, so any performance comparison should keep that in mind.
I’d be very interested to know how much licensing costs might be, given the co-dependence there is between Intel and AMD in that space
My understanding is that everything was pretty much wrapped up with the quid pro quo settlement where Intel allows AMD an x86 license and AMD allows Intel a license to the x64 extensions, and it also applies to extensions since then like AVX.

Here's the terms: https://www.sec.gov/Archives/edgar/data/2488/000119312509236...

I don't think any money actively changes hands on an ongoing basis for this, but when this was signed Intel did pay $1.25 billion and AMD agreed to drop anti-trust complaints against Intel.

Absolutely. I wasn't expecting Apple to beat any decent current GPU with the performance.

What I like about the M1 is that I get 60-70% of the performance with a chip that is much smaller and cooler than the GPUs it beats. I think for a long time we thought of computers as big, bulky, noisy heat generators, and Apple just showed that we can achieve roughly the same in much smaller equipment that is much cooler (in terms of temperature).

This is why M1 is a breakthrough for me.

> a chip that is much smaller and cooler

I assume you mean "enclosure" that's smaller?

M1 Ultra is 114 billion transistors, while the RTX 3090 is 28 billion. (Not sure of physical size of the chips, though - do you have a reference for that?)

Original point stands, of course, on efficiency/heat!

> Not sure of physical size of the chips, though - do you have a reference for that?

Anandtech has M1 Max at 432 mm2, Wikipedia has 3090 at 628 mm2.

So, since M1 Ultra is 2x M1 Max, it is physically larger than the 3090.
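For anyone who wants to redo the arithmetic, here is a rough sketch using only the figures quoted in this subthread. The "Ultra is exactly two Max dies" step is an assumption (it ignores any interconnect area), and the variable names are mine.

```python
# Figures as quoted in the thread; not independently verified here.
M1_MAX_DIE_MM2 = 432            # AnandTech figure cited above
RTX_3090_DIE_MM2 = 628          # Wikipedia figure cited above
M1_ULTRA_TRANSISTORS = 114e9    # whole SoC, not just the GPU
RTX_3090_TRANSISTORS = 28e9

# Assumption: M1 Ultra area is exactly two M1 Max dies.
m1_ultra_die_mm2 = 2 * M1_MAX_DIE_MM2

# How much larger the M1 Ultra silicon is than the 3090 die.
area_ratio = m1_ultra_die_mm2 / RTX_3090_DIE_MM2

# Transistors per square millimetre for each part.
m1_ultra_density = M1_ULTRA_TRANSISTORS / m1_ultra_die_mm2
rtx_3090_density = RTX_3090_TRANSISTORS / RTX_3090_DIE_MM2

print(f"M1 Ultra area: {m1_ultra_die_mm2} mm2 ({area_ratio:.2f}x the 3090)")
print(f"M1 Ultra density: {m1_ultra_density / 1e6:.0f} M transistors/mm2")
print(f"RTX 3090 density: {rtx_3090_density / 1e6:.0f} M transistors/mm2")
```

So on these numbers the Ultra is both the larger piece of silicon and the denser one, which fits it being on a newer process.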

If that's for the full M1 chip, not just the GPU, we also need to consider the i9's transistors in the benchmark.
The last time Intel released transistor count was for the i9-7980XE. It's a physically larger 18 core chip on the same process node as the 10 core 10900K tested, and googling says it had 9 billion transistors. If we assume that their improvements on 14nm let them fit all the same chips as the larger CPU into the newer one, then.... the i9 transistor count doesn't make a meaningful difference to the comparison.
> Intel has been sitting twiddling its thumbs for years,

What evidence do you have for that?

Intel really wanted to release (in my opinion) very interesting processors, but had serious problems over years with getting their 10 nm process "stable" (i.e. sufficiently high and reproducible yield rates). They even back-ported some of their planned CPU architectures to their working 14 nm process "just to get something out". Since these processors were not developed with 14 nm in mind, they of course were not as good as they could have been.

> Intel really wanted to release (in my opinion) very interesting processors...

No. Intel just wanted to show small improvements to keep the performance gap constant, hide their neat tricks until the competition caught up, and use them when clients or the market really demanded it.

The only notable efforts I've seen were reducing performance penalty of SpeedStep performance switching, making better memory controllers to "Catch" AMD, and other power-gating and independent throttling capabilities to address density issues in systems.

When fab/power/thermal issues became apparent, they started to hide AVX/AVX2 frequencies, created frankenprocessors for some applications, etc.

However, I've seen no real effort to make groundbreaking innovations in x86 space rather than protecting what they already had.

Performance counters, and other underlying piping to make processor observable was nice though.

As a result, I can still use a 3rd generation i7 as a daily driver for almost all tasks at hand, including development. The only definitive performance difference shows itself when I run my scientific code after compiling it with platform-specific optimizations on newer systems. In that regard, an M1 MacBook Air can be 25% faster than a 7th gen 7700K processor, and I find it ironic.

> However, I've seen no real effort to make groundbreaking innovations in x86 space rather than protecting what they already had.

I consider what AVX-512 has to offer to be highly innovative.

Unluckily, just when they planned to introduce AVX-512 into most desktop/laptop CPUs (not just server CPUs or special-purpose accelerators), the problems with 10 nm occurred. So this was delayed a lot, and even today many of Intel's desktop/laptop CPUs have no support for this feature.

Also Intel TSX was in my opinion really innovative (even though this feature was to my knowledge mostly used in (business) databases; what a pity).

I wouldn't call wider SIMD lanes terribly innovative. Particularly when they suffer from power costs to evaluate, time penalties just to fill the registers with enough data from cache or memory, and real workloads don't benefit from SIMD as much in practice when compilers are terrible at autovectorization (and humans are only marginally better at doing it manually).

AVX-512 is an example of a feature that improves special cases that show up in faux-workloads (eg: fancy benchmarks and HPC) but does not manifest higher performance for the vast majority of workloads, including things that ostensibly should be embarrassingly parallel and reap gains from SIMD.

SIMD lanes are just a miniaturization of older vector processors as co-processors, à la Cray in a box.

As an HPC sysadmin and scientific software developer/researcher, I can confidently say that SIMD can provide real performance gains, however there are trade-offs and decisions to be made.

- First of all, SIMD is very data-hungry. You either need to constantly push data into it, or modify the data you've pushed a lot. Otherwise you just sit.

- Then there comes the power and frequency penalty. In Intel's case, it needs a humongous amount of power in CPU budget terms, and it creates heat and slowdowns. So you have to test your code with and without SIMD (-mtune, -march, etc.). If your code is as fast or faster with it, use SIMD.

- Moreover, you can't just compile an extremely optimized binary and fan it out. Older processors will just throw "illegal instruction" and halt. You either provide multiple binaries with specific optimizations for each, or a lowest common denominator per vendor (an AMD binary and an Intel binary), or just throw it all out. The best way is giving out the source and providing a simple makefile to let the researcher/user compile it, but not all code is open, as one may guess. Creating a universal binary with multiple code paths is also possible, yet needs a lot of elbow grease, and may not always be optimal.

- Lastly, your code doesn't have to be embarrassingly parallel to be able to use SIMD. Matrix/linear algebra libraries like Eigen can almost abuse all of the processor's units when compiled with the correct flags (-O3, -mtune=native, -march=native). However, if you want to accelerate small data with SIMD, you need to create a parallel loop that saturates the SIMD pipelines, which OpenMP can easily do with its parallel for directive.
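To make the "multiple code paths" option from the list above a bit more concrete, here is a minimal sketch of runtime dispatch, written in Python purely for illustration. The kernel functions are hypothetical stand-ins (all three do the same math here); in a real build each would be a separately compiled routine. The flag names "avx2" and "avx512f" are the ones Linux reports in /proc/cpuinfo on x86, and the sketch assumes that file format. On other systems it simply falls back to the baseline path.

```python
from typing import Callable, Sequence, Set

Kernel = Callable[[Sequence[float], Sequence[float]], float]

def dot_baseline(a: Sequence[float], b: Sequence[float]) -> float:
    """Portable path: safe on any CPU."""
    return sum(x * y for x, y in zip(a, b))

def dot_avx2(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical stand-in for a routine built with AVX2 enabled."""
    return sum(x * y for x, y in zip(a, b))

def dot_avx512(a: Sequence[float], b: Sequence[float]) -> float:
    """Hypothetical stand-in for a routine built with AVX-512 enabled."""
    return sum(x * y for x, y in zip(a, b))

# Most specific path first; the first flag the CPU reports wins.
KERNELS = [("avx512f", dot_avx512), ("avx2", dot_avx2)]

def read_cpu_flags(path: str = "/proc/cpuinfo") -> Set[str]:
    """Return the CPU feature flags on Linux x86, or an empty set elsewhere."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags") and ":" in line:
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

def select_kernel(cpu_flags: Set[str]) -> Kernel:
    """Pick the best supported code path, never one the CPU cannot run."""
    for flag, kernel in KERNELS:
        if flag in cpu_flags:
            return kernel
    return dot_baseline

dot = select_kernel(read_cpu_flags())
print(dot.__name__, dot([1.0, 2.0], [3.0, 4.0]))
```

The point of choosing at startup rather than at compile time is that an older processor never reaches an instruction it does not support, so one artifact can be fanned out to a mixed cluster. The elbow grease is in writing and testing every path.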

None of this changes the fact that SIMD is a special horse which can't run on all courses; however, it's not useless.

I didn't say it was useless, just that it wasn't a magic bullet and AVX-512 isn't particularly innovative, and doesn't solve most users' problems.

I think you're missing the point of my post, I agree with all your points in specificity (except one, but not the forum to discuss FMV in modern compilers) but they miss the grander point that Intel hasn't made computers faster via more SIMD. The amount of expertise required to make use of it is just more evidence of that.

AVX512 was clearly a great innovation in the vectorization landscape. A far cleaner instruction set, complete and symmetric, with very interesting blend, ternlog, lane-crossing instructions and the especially interesting mask registers. Lots and lots of goodies and an eye for compiler implementation.

I feel Intel failed hard at diffusion of the ISA (why not put it everywhere, with half-perf, it'll improve later, no change in code) and also at not pushing more energy/dollars into ispc. Yeah yeah your compiler engineers are clever, but you've been doing this for 20 years and autovectorization is still ways off. Let me write code in a way that can be easily vectorized. A subset of C. Less awkward than cuda.

Now it seems AVX512 and large vector units are dying and are still too niche. Sad.

The cleanup being tied to the width increase was the first problem. The new width still being a fixed one was the second.

SVE is SIMD actually done right – on the Arm side in the near future, everything from smartphones to massive HPC boxes will be covered by the same clean SIMD ISA.

We spent years with quad-core i7 processors being the norm, with higher core count processors locked to Intel's HEDT platform. When Ryzen came out, all of a sudden the i7 8700K was able to be a 6-core processor instead of a 4-core. Then it wasn’t until Alder Lake was released in November of last year that we finally got desktop processors that weren’t on 14nm+++ (or however many + it was at). That’s not including the fact that you could overclock all the initial Ryzen processors, while Intel locked you to special SKUs of CPU and motherboard.
It’s basically innovation 101: don’t spend more money developing something if you know your customers' needs are already met.

They likely knew what Apple was up to with efficiency cores several years ago, and only decided to accelerate manufacturing of Alder Lake once they realised the market was cool with that form of architecture.

"Innovation 101: don't innovate if there's no demand for" doesn't sound like innovation to me.
Putting effort where demand is, doesn't seem like the stupidest strategy to me. Creating demand for a new kind of product or service is nice, but reaching a market where their needs are, seems clever.

What baffles me about this 'innovate on something else than peak perf' is... What did they innovate on massively instead then, if not that? Apart from AVX512?

> Putting effort where demand is, doesn't seem like the stupidest strategy to me.

It can indeed be a sensible business strategy, but it will much less likely lead to remarkable innovations.

Intel did not have any major advances from 2nd gen all the way until the 7th. The advancements were generally small (often single-digit) IPC or clock speed improvements.

Only after AMD released Ryzen did Intel have to respond, with their 8th gen, by cranking up core counts. And IPC did not increase until 11th gen (the backported arch you mentioned). In my opinion, the performance delta between 32nm Sandy Bridge and 14nm Kaby Lake is ridiculously small.
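As a back-of-the-envelope check on what "single-digit per generation" adds up to, here is a small sketch. The per-generation percentages are illustrative assumptions, not measured IPC figures, and counting 2nd gen to 7th gen as five steps is my simplification.

```python
def cumulative_gain(per_generation_gain: float, generations: int) -> float:
    """Compound a fixed fractional gain over a number of generational steps."""
    return (1.0 + per_generation_gain) ** generations

GENERATION_STEPS = 5  # assumption: 2nd gen -> 7th gen counted as five steps

# Illustrative single-digit gains; none of these are benchmark results.
for gain in (0.03, 0.05, 0.09):
    total = cumulative_gain(gain, GENERATION_STEPS)
    print(f"{gain:.0%} per generation -> {total - 1:.1%} total")
```

Even the top of the single-digit range compounds to well under a doubling across the whole span.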

> In my opinion, the performance delta between 32nm Sandy Bridge and 14nm Kaby Lake is ridiculously small.

Two of my Linux workstations are Sandy Bridge and Kaby Lake, and my real world experience bears this out. I can't distinguish between the two for everyday use cases; only synthetic benchmarks show any real advantage in the newer system. I can't speak for Windows performance differences, as my only Windows system is my gaming rig with a Ryzen 5 3600, which of course trounces both of the workstations no matter the OS.

Are there still software features exclusive to Intel's chips? I remember my dabble in Android Studio was painful, since the 'simulated device' functionality was only available on Intel, while I had some FX-8??? AMD chip.

I'm not in the market for a new gaming PC quite yet, but it's also going to be a personal workstation, so I don't want to deal with anything like above if it can be helped.

Get a Ryzen. Intel is restricting many features like ECC, overclocking or virtualization (important for emulators/android development) to certain processors/chipsets. If you want to save money, get a used workstation.

ECC -> Xeon/C

OC -> K/Z

Virtualization -> Xeon&non-K/C&H

Intel caved a little: Alder Lake consumer chips have ECC ... if you use the right chipset:

https://www.tomshardware.com/news/intel-enables-ecc-on-12th-...

Looks like it's time to update the fantasy roster on pc part picker. A shame, since it had such a nice aesthetic! The Vision D looks so clean and well-featured

By the time I get around to actually building it, there may be some similarly-styled mini-ITX AM4 motherboards, so not all hope is lost.

Back in ~2012 I ran the Python Meetup here in Phoenix. One of the compiler leads for Atom processor attended and he basically told the tale that the Atom processor was being neutered. Intel higher ups were worried the Atom was getting too close to desktop/server processor performance. They were very concerned about cannibalization from within. Also, even then, Intel thought very little of mobile processors.
The classic tale of protecting the dinosaur technology: the Xerox Alto PC, Kodak killing their own digital camera, etc.
That's disappointing if true. If you don't cannibalize your business, someone else will.
Interesting processors? Yes, Intel released IA64 with Itanium, which was interesting and a dud in the market. Then they came up with Xeon Phi, which was a dud in the market. Then they brought Larrabee, which was a failure in the market. Part of Larrabee's issues did stem from process limitations. Intel had every opportunity to hedge its bets by buying from TSMC, Samsung, or others just as everyone else did, but kept sinking more money into their own foundries without getting what they paid for.

Meanwhile AMD gave the market what customers clamored for: a 64-bit extension to the IA32 platform. Then AMD gave us massively performant APUs. Then AMD gave us multi-die packaging and left the IO die on a more sensible process for that function while using smaller processes where frequencies matter more.

Then Apple with some help from ARM gave us a SOC for laptop and desktop use that's frankly kind of embarrassing AMD and Intel, not so much for the core design really as its integration with memory.

Intel isn't just unlucky here. They've made a series of serious missteps going back a couple of decades now.

> Intel really wanted to release (in my opinion) very interesting processors, but had serious problems over years with getting their 10 nm process "stable" (i.e. sufficiently high and reproducible yield rates).

Intel was just doing what they have been doing for the last 40 years - building faster x86 CPU's.

They weren't even considering something as grand for desktop/laptops as Apple was with the M1 (i.e. a fully integrated SOC).

I don't think they were twiddling their thumbs to be fair - they were probably pushing hard in the same direction they have been pushing in for the last 40 years, but failed to see the industry change under their feet.

> They weren't even considering something as grand for desktop/laptops as Apple was with the M1 (i.e. a fully integrated SOC).

Perhaps I misinterpret your argument, but it is my impression that Intel (and also AMD!) took huge steps in that direction, just in a more incremental way than Apple did:

- integrated GPU: check

- integrated memory controller: check

- integrated northbridge: check: https://en.wikipedia.org/w/index.php?title=Northbridge_(comp...

- developing a smartphone SoC: Intel did invest serious money into it and developed SoFIA and Broxton. Intel even strongly subsidized smartphone producers to use them. This all turned out to be a huge commercial failure and thus Intel left the smartphone SoC business.

It is also not the case that a fully-integrated SoC is "better". Rather having not everything in one SoC enabled much more flexibility for OEMs. Fully integrated SoC versus more chips is rather a trade-off between various goals.

Integrated GPU's have been around since 1991, so I wouldn't personally point to that as an example of Intel continuing to be highly 'incrementally' innovative.

Similarly the M1 chip came out of a smartphone SOC - it was just that Apple saw the potential for laptop/desktop adoption while Intel clearly didn't (maybe because they failed to get into the smartphone business - but their failure in that market is yet another example of 'too slow, too little, too late').

"Intel was just doing what they have been doing for the last 40 years - building faster x86 CPU's."

Yes Intel has been doing some minimum improvement to CPUs each year, but the reality of "Intel was just doing what they have been doing for the last 40 years - " is....

"Milking their Monopoly"

The AMD lawsuit with Intel back in the Netburst days showed how Intel was just as bad if not worse than Microsoft at anticompetitive behavior to lock out competitors in the PC market. I'll throw Intel a bone in that for decades with this monopoly power they still continued to push the process and design envelope (probably because they were afraid of becoming Motorola, and when they still had engineering leadership left over from the earlier days).

But Intel is a badly overfed Jabba the Hutt. Gelsinger has his work cut out for him.

It's expected that they're trying to build chips that are faster on x86 rather than switch to an entirely new architecture - they can't switch without the full support of Microsoft and at least some major Linux distributions, not to mention the OEMs they sell their chips to.
They absolutely could have built an ARM chip.

The fact that Microsoft has released several ARM laptops but selected Qualcomm to provide the processors suggests that Intel at least had an opportunity to play the game, they just never came to the table. It’s hardly Microsoft slowing them down when Microsoft is ahead of Intel on this, but their reluctance to push forwards means that they are now behind.

It’s not Microsoft’s job to push Intel anyway; it’s Intel's job to create a product so compelling that their partners adopt it. If they want the support of major Linux distros, they just have to write it themselves rather than wait for volunteers to do their work for them.

Yeah, I've always been pretty skeptical of the "Intel just sat doing nothing" narrative. The impression I got back in the day is that they went quite hard on their 10nm process with a reasonably ambitious set of changes, then failed to scale it to production. That had substantial knock-on effects including delays to the associated microarchitectures as well as subsequent nodes and backports.

Regardless of the reasons I hope they recover; multiple providers of cutting-edge fabrication technology will be essential.

Intel has absolutely been lazy and literally just gave 5% performance increase per year for quite some time. When AMD was making shitty processors, Intel was just trying to squeeze as much money as possible from marginal upgrades.

You can run Windows on an Intel CPU that is 10 years old and notice hardly any difference in performance.

And on the other hand, they changed everything. Their CPUs are actually innovative and really fast, and brought the entire multi-core thing into consumer hands in a real way.

They've been trying to move into other non-x86 architectures. Had they taken off they would have been in a more competitive position now.