by giantg2 702 days ago
I was recently looking at building and buying a couple systems. I've always liked Intel. I went AMD this time.

It seemed like the base frequencies vs boost frequencies were much farther apart on Intel than with most of the AMDs. This was especially true on the laptops, where cooling is a larger concern. So I suspect they were pushing limits.

Also, the performance core vs efficiency core stuff seemed kind of gimmicky with so few performance cores and so many efficiency cores. Like look at this 20 core processor! Oh wait, it's really an 8 core when it comes to performance. Hard to compare that to a 12 core 3D cached Ryzen with even higher clock...

I will say, it seems Intel might still have some advantages. It seems AMD had an issue supporting ECC with the current chipsets. I almost went Intel because of it. I ended up deciding that DDR5's built-in error correction was enough for me. The performance graphs also seem to indicate a smoother throughput, suggesting more efficient or elegant execution (less blocking?). But on average the AMDs seem to be putting out similar end results even if the graph is a bit more "spiky".

5 comments

> It seems AMD had an issue supporting ECC with the current chipsets.

AMD has the advantage with regards to ECC. Intel doesn't support ECC at all on consumer chips, you need to go Xeon. AMD supports it on all chips, but it is up to the motherboard vendor to (correctly) implement. You can get consumer-class AM4/5 boards that have ECC support.

> AMD supports [ECC RAM] on all chips

There was a strange happening with AMD laptop CPUs (“APUs”): the non-soldered DDR5 variants of the 7x40’s were advertised to support ECC RAM on AMD’s website up until a couple months before any actual laptops were sold, then that was silently changed and ECC is only on the PRO models now. I still don’t know if this is a straightforward manufacturing or chipset issue of some kind or a sign of market segmentation to come.

(I’m quite salty I couldn’t get my Framework 13 with ECC RAM because of this.)

> AMD supports it on all chips

Unfortunately not. I can't say for current gen, but the 5000 series APUs like the 5600G do not support ECC. I know, I tried...

But yes, most Ryzen CPUs do have ECC functionality, and have had it since the 1000 series, even if not officially supported. Official support for ECC is only on Ryzen PRO parts.

You need W680 boards (starting at around 500 bucks) for ECC on desktop intel chips.
I was seeing them around $400 (still expensive).
Actually some of the 13th and 14th gen Intel Core processors support ECC.
Intel has always randomly supported ECC on desktop CPUs. Sometimes it was just a few low end SKUs, sometimes higher end SKUs. On 14th gen it appears i9s and i7s do; I didn't check i5s, but i3s did not.
My understanding is that it's screwed up for multiple vendors and chipsets. The boards might say they support it, but there are some updates saying it's not. It seemed extremely hard to find any that actually supported it. It was actually easier to find new Intel boards supporting ECC.
yeah wendell put out a video a few weeks ago exploring a bunch of problems with asrock rack-branded server-market B650 motherboards and basically the ECC situation was exactly what everyone warns about: the various BIOS versions wandered between "works, but doesn't forward the errors", "doesn't work, and doesn't forward the errors", and (excitingly) "doesn't work and doesn't even post". We are a year and a half after zen4 launched and there barely are any server-branded boards to begin with, and even those boards don't work right.

https://youtu.be/RdYToqy05pI?t=503

I don't know how many times it has to be said but "doesn't explicitly disable" is not the same thing as "support". There are lots of other enablement steps that are required to get ECC to work properly, and they really need to be explicitly tested with each release (which if it is "not explicitly disabled", it's not getting tested). Support means you can complain to someone when it doesn't work right.

AMD churns AGESA really, really hard and it breaks all the time. Partners have to try and chase the upstream and sometimes it works and sometimes it doesn't. Elmor (Asus's Bios Guy) talked about this on Overclock.net back around 2017-2018 when AMD was launching X399 and talked about some of the troubles there and with AM4.

That said, the current situation has seemingly lit a fire under the board partners, with Intel out of commission and all these customers desperate for an alternative to their W680/raptor lake systems (which do support ecc officially, btw) in these performance-sensitive niches or power-limited datacenter layouts, they are finally cleaning up the mess like, within the last 3 weeks or so. They've very quickly gone from not caring about these boards to seeing a big market opportunity.

https://www.youtube.com/watch?v=n1tXJ8HZcj4

can't believe how many times I've explained in the last month that yes, people do actually run 13700Ks in the datacenter... with ECC... and actually it's probably some pretty big names in fact. A previous video dropped the tidbit that one of the major affected customers is Citadel Capital - and yeah, those are the guys who used to get special EVEREST and BLACK OPS skus from intel for the same thing. Client platform is better at that, the very best sapphire rapids or epyc -F or -X3D sku is going to be like 75% of the performance at best. It's also the fastest thing available for serving NVMe flash storage (and Intel specifically targeted this, the Xeon E-2400 series with the C266 chipset can talk NVMe SAS natively on its chipset with up to 4 slimsas ports...)

it's somewhere in this one I think: https://www.youtube.com/watch?v=5KHCLBqRrnY

The new EPYC processors for AM5 look like they'll be ok for ECC RAM though, at least in the coming months onwards.
Yeah I think that’s the bright spot: now that there’s a branded offering for server-flavored Ryzen, maybe there is a permanent justification for doing proper validation.

I just feel vindicated lol, it always comes up that “well works fine for me!” and the reality is it’s a total crapshoot with even server-branded boards often not working. There is zero chance your gigabyte UD3 or whatever is going to be consistently supported across bios and often it will not be.

And AMD is really really tied to AGESA releases, so it’s fairly important on that side. Although I guess maybe we’re seeing now what happens if you let too much be abstracted away… but on the other hand partners were blowing up AMD chips last year too.

If you’re comfortable always testing, and always having the possibility of there being some big AGESA problem and ecc being broken on the new versions… ok I guess.

There is a reason the i3 chips were perennial favorites for edge servers and NASs. And I think it's really, really hard to overstate the long-term damage from reputation loss here. Intel, meltdown aside, was always no-drama in terms of reliability. Other than C2000/C3000, I guess.

...and puma and i-225V chipsets.

or at least... maybe on the CPU side they were no-drama. Other than C2000/C3000. Granted the powervr graphics on the atoms way back did suck... and meltdown... and avx-512 being rolled back... /phillip j fry counting on his fingers

maybe "blue-chip coded" is a better way to express it ig

but like, there is a notable decline in the quality of execution of intel overall, pretty much across the board, and cpu was always their core vertical, right? That was their business redoubt. intel is blue chip chips, especially CPUs. And now it's falling - really it's been falling for a while. Meltdown I can generally excuse (yes, shush), nobody appreciated sidechannels back then even if they were theoretically known. C2000/C3000 is another fuckup. yeah it's the super-io/serial bus controller... technically not their IP but it happens to be in a critical path, on their node, killing their processor. They fucked up the validation there, evidently.

I-225V had three steppings and I-226V is still not fully fixed (windows/linux have just turned off the EEE/802.3az feature instead). Puma was a god damned mess.

Sapphire rapids was late, still a huge mess, and actually the -W platform had not only insane power draw, but also insaner transients. 750W average, spiking up to 1500W under load, with pretty steep holdup requirements. And actually that was locked behind a "water cooled" bios option, the processor just "refused to all-core turbo" otherwise. And Intel didn't wanna actually say that the "water cooled" behavior was the spec or intentional turbo limits etc. In hindsight hmmm, that all took a bit of a different tone, didn't it?

Supposedly there is going to be a SPR-W refresh with a new stepping to fix this... emerald rapids is also very power-hungry and there were some unconfirmed murmurs suggesting it might have the same crash problems.

(yes, yes, please just listen to the guest here.) https://www.youtube.com/watch?v=_HJu5xt43iQ&t=3603s

https://wccftech.com/intel-xeon-w-3500-w-2500-sapphire-rapid...

Intel's in some real danger especially with AMD ascendant like this. Like it doesn't take very long of this real damage to customers etc and that "we're blue-chip!" thing will cease to be, and that is the last prop keeping intel's finances above the water here. Sure, it will take a while to fully wind down but... this is a great example of how intel's fuckups are driving their clients literally into the arms of the competition. A month or two ago, Asrock Rack didn't give a shit about the B650-2L2T or whatever. Guess what? Now Epyc Mini exists and oems are going to be paying attention to that. Oops.

ECC support wasn't good initially on AM5, but there are now Epyc branded chips for the AM5 socket which officially support ECC DDR5. They come in the same flavors as the Ryzen 7xx0 chips, but are branded as Epyc.
More E-cores are reasonable for multithreaded application performance. They are efficient in power and die area, as the name indicates, so Intel can fit more E-cores than P-cores in the same power/area budget. That is not suitable for those who need many cores with high single-threaded performance, like a VM server, but I don't know if there is any major consumer usage that requires such performance.
> but I don't know if there is any major consumer usage that requires such performance.

Gaming.

There are some games that will benefit from greater single-core performance.

I can sort of see that. The way I saw it explained was that they have a much lower clock and a pretty small shared cache. I could see E cores as being great for running background processes and stuff. All the benchmarks seem to show the AMDs with 2/3rd the cores being around the same performance and with similar power draw. I'm not putting them down. I'm just saying it seems gimmicky to say "look at our 20 core!" with the implicit idea that people will compare that with an AMD 12 core, seeing 20>12 but not seeing the other factors like cost and benchmarks.
It's the megahertz wars all over again!
Computers have taught us the rubric of truth: Numbers go Up.
> so few performance cores and so many efficiency cores

I was baffled by this too, but what they don't make clear is that the performance cores have hyperthreading and the efficiency cores do not.

So what they call 2P+4E actually becomes an 8 core system as far as something like /proc/cpuinfo is concerned. They're also the same architecture so code compiled for a particular architecture will run on either core set and can be moved from one to the other as the scheduler dictates.
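A minimal sketch of that arithmetic, assuming two hardware threads per P core and one per E core as described above. The function name `logical_cpus` is made up for illustration; `os.cpu_count()` is the standard-library call that reports what the OS sees.

```python
import os


def logical_cpus(p_cores: int, e_cores: int, threads_per_p_core: int = 2) -> int:
    """Logical CPUs the OS would list, assuming only P cores have hyperthreading."""
    return p_cores * threads_per_p_core + e_cores


print(logical_cpus(2, 4))    # 2P+4E  -> 8 logical CPUs
print(logical_cpus(8, 12))   # 8P+12E -> 28 logical CPUs

# What this machine reports (logical CPUs, not physical cores).
print(os.cpu_count())
```

So a "20 core" 8P+12E part would show up as 28 entries in something like /proc/cpuinfo, under these assumptions.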

> They're also the same architecture so code compiled for a particular architecture will run on either core set and can be moved from one to the other as the scheduler dictates.

I don't know if that has done more good than harm, since they ripped AVX-512 out for multiple generations to ensure parity.

A major differentiator is that Intel CPUs with E cores don’t allow the use of AVX-512, but all current AMD CPUs do. The new Zen 5 chips will run circles around Intel for any such workload. Video encoding, 3D rendering, and AI come to mind. For developers: many database engines can use AVX-512 automatically.
> Like look at this 20 core processor! Oh wait, it's really an 8 core when it comes to performance.

The E cores are about half as fast as the P cores depending on use case, at about 30% of the size. If you have a program that can use more than 8 cores, then that 8P+12E CPU should approach a 14P CPU in speed. (And if it can't use more than 8 cores then P versus E doesn't matter.) (Or if you meant 4P+16E then I don't think those exist.)
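Spelling out the back-of-envelope math, taking the rough figures above at face value (an E core at about half the speed and about 30% of the area of a P core). Both ratios are the commenter's estimates, not measured values, and the function names are made up for illustration.

```python
def p_core_equivalents(p_cores: int, e_cores: int, e_relative_speed: float = 0.5) -> float:
    """Throughput in units of 'one P core', for a perfectly parallel workload."""
    return p_cores + e_cores * e_relative_speed


def p_core_area(p_cores: int, e_cores: int, e_relative_area: float = 0.3) -> float:
    """Die area in units of 'one P core'."""
    return p_cores + e_cores * e_relative_area


print(p_core_equivalents(8, 12))        # 14.0
print(f"{p_core_area(8, 12):.1f}")      # 11.6
```

Under those assumptions the 8P+12E layout gets roughly 14 P cores of throughput from under 12 P cores of area, which is the argument for E cores on well-threaded workloads.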

> Hard to compare that to a 12 core 3D cached Ryzen with even higher clock...

Only half of those cores properly get the advantage of the 3D cache. And I doubt those cores have a higher clock.

AMD's doing quite well but I think you're exaggerating a good bit.

> If you have a program that can use more than 8 cores, then that 8P+12E CPU should approach a 14P CPU in speed

Only if you use work stealing queues or (this is ridiculously unlikely) run multithreaded algorithms that are aware of the different performance and split the work unevenly to compensate.

Or if you use a single queue... which I would expect to be the default.

Blindly dividing work units across cores sounds like a terrible strategy for a general program that's sharing those cores with who-knows-what.

It’s a common strategy for small tasks where the overhead of dispatching the task greatly exceeds the computation of it. It’s also a better way to maximize L1/L2 cache hit rates by improving memory locality.

Eg you have 100M rows and you want to cluster them by a distance function (naively), running dist(arr[i], arr[j]) is crazy fast, the problem is just that you have so many of them. It is faster to run it on one core than dispatch it from one queue to multiple cores, but best to assign the work ahead of time to n cores and have them crunch the numbers.
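A rough sketch of the "assign the work ahead of time" approach: split the row indices into one contiguous range per worker, so each worker walks its own slice of memory with no per-item dispatch. `split_ranges` is a hypothetical helper, not taken from any particular library.

```python
def split_ranges(n_items: int, n_workers: int) -> list:
    """Split range(n_items) into n_workers contiguous, near-equal ranges."""
    base, extra = divmod(n_items, n_workers)
    ranges = []
    start = 0
    for w in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if w < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


print(split_ranges(10, 3))   # [range(0, 4), range(4, 7), range(7, 10)]
```

Each worker would then loop over its own range. The catch, as the rest of the thread points out, is that an equal split assumes equally fast cores.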

It has always been a bad idea to dispatch so naively and dispatch to the same number of threads as you have cores. What if a couple cores are busy, and you spend almost twice as much time as you need waiting for the calculation to finish? I don't know how much software does that, and most of it can be easily fixed to dispatch half a million rows at a time and get better performance on all computers.
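A toy simulation of the difference, not a benchmark. It assumes 8 P cores at speed 1.0 and 12 E cores at speed 0.5 (the rough ratio quoted upthread), no other load on the machine, and zero dispatch overhead. Time is in arbitrary units of "rows per unit speed".

```python
import heapq


def static_split_time(n_items, speeds):
    """Equal share per core; the slowest core sets the finish time."""
    share = n_items / len(speeds)
    return max(share / s for s in speeds)


def chunked_queue_time(n_items, speeds, chunk):
    """Cores pull fixed-size chunks from one shared queue as they become free."""
    # Min-heap of (time the core becomes free, core index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    remaining = n_items
    finish = 0.0
    while remaining > 0:
        t, i = heapq.heappop(free_at)      # next core to become idle
        take = min(chunk, remaining)
        remaining -= take
        done = t + take / speeds[i]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


speeds = [1.0] * 8 + [0.5] * 12            # assumed P/E speed ratio
n = 100_000_000

print(static_split_time(n, speeds))            # 10000000.0
print(chunked_queue_time(n, speeds, 500_000))  # noticeably lower
print(n / sum(speeds))                         # ideal lower bound
```

In this model the equal split waits on the E cores, while half-million-row chunks land close to the ideal because the P cores simply come back for more work.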

Also on current CPUs it'll be affected by hyperthreading and launch 28 threads, which would probably work out pretty well overall.

> What if a couple cores are busy

If you don't pin them to cores, the OS is still free to assign threads to cores as it pleases. Assuming the scheduler is somewhat fair, threads will progress at roughly the same rate.

> run multithreaded algorithms that are aware of the different performance and split the work unevenly to compensate.

This is what the Intel Thread Director [0] solves.

For high-intensity workloads, it will prioritize assigning them to P-cores.

[0] https://www.intel.com/content/www/us/en/support/articles/000...

Then you no longer have 14 cores in this example, but only len(P) cores. Also most code written in the wild isn’t going to use an architecture-specific library for this.
The P cores being presented as two logical cores and E cores presented as a single logical core results in this kind of split already.
Yeah, the 20 core Intels are benchmarking about the same as the 12 core AMD X3Ds. But many people just see 20>12. Either one is more than fine for most people.

"Oh wait, it's really an 8 core when it comes to performance [cores]". So yes, it should not be treated as an 8 core altogether, but, like you said, as about 14 cores, or 12 with the 3D cache.

"And I doubt those cores have a higher clock."

I'm not sure what we're comparing them to. They should be capable of higher clock than the E cores. I thought all the AMD cores had the ability to hit the max frequency (but not necessarily at the same time). And some of the cores might not be able to take advantage of the 3D cache, but that doesn't limit their frequency, from my understanding.

It’s kind of funny and reminiscent of the AMD bulldozer days where they had a ton of cores compared to the contemporary Intel chips, especially at low/mid price points but the AMD chips were laughably underwhelming for single core performance which was even more important then.

I can’t speak to the Intel chips because I’ve been out of the Intel game for a long time but my 5700X3D does seem to happily run all cores at max clock speed.

> I'm not sure what we're comparing them to. They should be capable of higher clock than the E cores.

Oh, just higher clocked than the E cores. Yeah that's true, but if you're using that many cores at once you probably only care about total speed.

You said 12 core with higher clock versus 8, so I thought you were comparing to the performance cores.

> I thought all the AMD cores had the ability to hit the max frequency (but not necessarily at the same time).

The cores under the 3D cache have a notable clock penalty on existing CPUs.

> And some of the cores might not be able to take advantage of the 3D cache, but that doesn't limit their frequency, from my understanding.

Right, but my point is it's misleading to call out higher core count and the advantages of 3D stacking. The 3D stacking mostly benefits the cores it's on top of, which is 6-8 of them on existing CPUs.

"The cores under the 3D cache have a notable clock penalty on existing CPUs."

Interesting. I can't find any info on that. It seems to make sense though, since the 7900X's TDP is 50 W higher than the 7900X3D's.

"Right, but my point is it's misleading to call out higher core count and the advantages of 3D stacking"

Yeah, that makes sense. I didn't realize there was a clock penalty on some of the cores with the 3D cache and that only some cores could use it.

It's due to the stacked cache being harder to cool and not supporting as high a voltage. So the 3D CCD clocks lower, but for some workloads it's still faster (mainly ones dealing with large buffers, like games; most compute heavy benchmarks fit in normal caches, and the non 3D V-Cache variants take the win).