by xgstation 858 days ago
This is interesting. The article mentions that Zen4c has the same architecture as Zen4 but is optimized for density and runs at a lower frequency. A question, if anyone knows the answer: does high frequency require significantly more transistors? And does optimizing for density also mean lower power consumption (assuming both Zen4 and Zen4c run at the same frequency)?
6 comments

For a given process design kit (PDK), the synthesis tool will have a few different types of transistors to choose from. They correspond to different trade-offs between size and power on one hand, and speed on the other. The faster the transistor, the bigger it is and the more it leaks (a lower threshold voltage means faster switching, but more leakage).
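To put rough numbers on the threshold-voltage trade-off, here is a minimal sketch using two textbook approximations: subthreshold leakage growing about tenfold per fixed step of threshold reduction, and the alpha-power law for gate delay. The function names, the 80 mV/decade swing, the alpha of 1.3, and the voltages in the usage lines are all assumptions picked for illustration, not values from any real PDK.

```python
# Textbook approximations with made-up parameters; not data from any real PDK.

def relative_leakage(vt_drop_mv, swing_mv_per_decade=80.0):
    """Factor by which off-state leakage grows when Vt is lowered by vt_drop_mv.

    Subthreshold leakage grows roughly tenfold for every `swing` millivolts
    of threshold reduction (the 80 mV/decade default is an assumed value).
    """
    return 10.0 ** (vt_drop_mv / swing_mv_per_decade)


def relative_delay(vdd_v, vt_v, alpha=1.3):
    """Gate delay up to a constant factor, using the alpha-power law."""
    if vdd_v <= vt_v:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return vdd_v / (vdd_v - vt_v) ** alpha


if __name__ == "__main__":
    # Hypothetical comparison: lower Vt from 0.40 V to 0.32 V at a 1.0 V supply.
    speedup = relative_delay(1.0, 0.40) / relative_delay(1.0, 0.32)
    print(f"speedup: {speedup:.2f}x, leakage: {relative_leakage(80):.0f}x")
```

In this toy model a modest speed gain costs an order of magnitude in leakage, which is why the fast transistors only get used where timing demands them.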

For a given target frequency, the synthesis tool will always use the most efficient transistors it can, and the result is a mix of the few available types. But the higher the frequency, the higher the proportion of faster and bigger transistors in the mix.

This is the bird's eye view and very simplified, but hopefully enough to get the idea ;)
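The cell mix described above can be sketched as a toy selection loop: for each timing path, pick the slowest (most efficient) cell flavor that still fits in the clock period. Real synthesis tools are far more sophisticated; the three flavors, their delays, leakage and area figures, and the path depths below are all invented for illustration.

```python
# Toy model: every number here is made up, not taken from any real PDK.
CELL_TYPES = [
    # (name, delay per gate in ps, relative leakage, relative area), slowest first
    ("high-Vt", 20.0, 1.0, 1.0),
    ("standard-Vt", 15.0, 5.0, 1.1),
    ("low-Vt", 11.0, 25.0, 1.3),
]

# Logic depth (gates in series) of some hypothetical timing paths.
PATH_DEPTHS = [8, 12, 16, 20, 24]


def pick_cell(depth, period_ps):
    """Return the name of the slowest cell type that still meets timing."""
    for name, delay_ps, _leakage, _area in CELL_TYPES:
        if depth * delay_ps <= period_ps:
            return name
    return None  # even the fastest cell type cannot meet this period


def cell_mix(freq_ghz):
    """Count how many paths end up on each cell type at a target frequency."""
    period_ps = 1000.0 / freq_ghz
    mix = {name: 0 for name, *_ in CELL_TYPES}
    for depth in PATH_DEPTHS:
        name = pick_cell(depth, period_ps)
        if name is None:
            raise ValueError(f"path of depth {depth} cannot meet {freq_ghz} GHz")
        mix[name] += 1
    return mix


if __name__ == "__main__":
    for freq in (2.0, 3.0, 3.7):
        print(freq, "GHz ->", cell_mix(freq))
```

Raising the target frequency pushes more of the deep paths onto the fast, leaky, larger cells, and past some point no mix can meet timing at all.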

It's wild that I work on the "hardware" (kernel, cgroups, vfio, qemu) and know so little about what goes into building the actual hardware.

I think this post just taught me more about the EE involved in chips than I've learned over 15 years.

It may require more transistors in some places, e.g. in longer buffer chains needed to drive greater capacitances at higher frequencies, but it requires mostly bigger transistors in many places.

According to AMD, both the big core and the small core use the same RTL design, but a different physical design, i.e. they use different libraries of standard cells (optimized either for high speed or for low area and low power consumption) and different layouts in the custom parts.

My understanding is that AMD approaches the challenge of balancing core count for multi-threaded work against high frequency for single- or lightly threaded tasks in a very different way from Intel.

Intel goes with: here are some really beefy cores that can do anything, and here are some weaker cores that can do only some tasks.

AMD goes with: here is one half of the cores, which can go really fast, and here is the other half, which must remain slower, but every core can do everything.

In theory, Intel could have better perf if software is optimized for it, while AMD could have better perf with any generic random app out there... as long as the OS has enough hints to put the right app on the right core, and bothers to do it.

I think it's much less a philosophical difference and much more about what they had lying around. Intel had Atom core designs available to pair up with their desktop cores, and combining them into one chip was clearly a rush job rather than the plan from the start.

On the other hand, AMD only really has their Zen series of cores to use, but they rely more than Intel on automated layout tools so they can more easily port designs to a different fab process or do a second physical layout of the same architecture on the same process.

> Intel goes with: here are some really beefy cores that can do anything, and here are some weaker cores that can do only some tasks.

This isn't true any more as of Intel's current CPUs (Meteor Lake). Both P and E cores support the same instruction set, including AVX10.

They don't require more transistors, they require bigger transistors. Ideally, if transistor A is pushing a line with twice the capacitance attached compared to transistor B, transistor A would be twice as wide and so have twice the drive current of transistor B. But of course making transistors bigger increases the capacitive load of driving them[1]. So you solve for an equilibrium, trading off the current-to-capacitance ratio against total chip size. And Zen 4 and Zen 4c choose different ratios to optimize for.

[1] Back in the day due to the intrinsic capacitance of the transistors themselves. These days more because bigger transistors are further apart leading to more line capacitance.
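The sizing equilibrium described above can be illustrated with the textbook tapered inverter chain, where each stage is a fixed factor wider than the one driving it. This is the standard logical-effort style approximation, swapped in here as an illustration and not anything AMD has described; the function names, the parasitic delay of 1.0, and the load ratio of 64 are assumptions.

```python
def chain_delay(load_ratio, stages, parasitic=1.0):
    """Delay (in normalized units) of an inverter chain with equal stage effort.

    load_ratio is the final load capacitance divided by the input capacitance
    of the first inverter. Each stage is load_ratio ** (1 / stages) times wider
    than the previous one.
    """
    stage_effort = load_ratio ** (1.0 / stages)
    return stages * (stage_effort + parasitic)


def stage_widths(load_ratio, stages):
    """Relative transistor widths of each stage, first stage = 1."""
    stage_effort = load_ratio ** (1.0 / stages)
    return [stage_effort ** i for i in range(stages)]


def fastest_chain(load_ratio, max_stages=12, parasitic=1.0):
    """Number of stages that minimizes delay for the given load."""
    return min(range(1, max_stages + 1),
               key=lambda n: chain_delay(load_ratio, n, parasitic))


if __name__ == "__main__":
    load = 64.0  # hypothetical load, 64x the first inverter's input capacitance
    for n in (1, 2, 3, 4):
        width = sum(stage_widths(load, n))
        print(f"{n} stages: delay {chain_delay(load, n):.1f}, total width {width:.1f}")
```

In this model the fastest chain for a 64x load uses three stages, but a two-stage chain gets by with less than half the total transistor width at a modest delay penalty. That is the kind of speed-for-area trade a density-optimized design can take.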

Is none of it based off of binning now, with sections of lower-performing chiplets or cores fused off to make the efficiency cores?

It is very rare to be able to fuse off part of a CPU core. Fusing off part of its cache is common, but other than that the only example that comes to mind is some server CPUs where Intel fused off the third vector unit.

Oh, right, this is part of a core not a whole core. AMD often fuses those off for lower part number SKUs.

FWIW Intel supposedly “fused off” AVX-512 in Alder Lake, though I don’t think that was what was actually done, physically speaking.

No, binning can't make cores physically smaller, which is what AMD is doing.

High clock rates require smaller clock domains, where everything needs to happen in the same clock cycle. If you break the same logic into smaller clock domains, you need buffers between the domains. Zen4c significantly dropped the max frequency, so there are fewer clock domains and much less chip area spent on buffering transistors.
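A rough way to see the buffering cost is to treat each chunk of logic that must finish within one cycle as a stage with a fixed buffering overhead at its boundary. That reading of the paragraph above is my assumption, and the 2000 ps of total logic delay and 50 ps of overhead per boundary are invented numbers.

```python
import math


def stages_needed(freq_ghz, logic_delay_ps=2000.0, overhead_ps=50.0):
    """Number of stages needed so each one fits in a single clock cycle.

    logic_delay_ps is the total delay of the logic being split up, and
    overhead_ps is the fixed cost of the buffering at each stage boundary.
    Both defaults are hypothetical.
    """
    period_ps = 1000.0 / freq_ghz
    budget_ps = period_ps - overhead_ps  # time left for useful logic per cycle
    if budget_ps <= 0:
        raise ValueError("buffering overhead alone exceeds the clock period")
    return math.ceil(logic_delay_ps / budget_ps)


if __name__ == "__main__":
    for freq in (2.0, 3.0, 5.0):
        print(f"{freq} GHz -> {stages_needed(freq)} stages")
```

The stage count, and with it the area spent on boundary buffering, grows faster than the frequency does, because the fixed overhead eats a larger share of each shorter cycle.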

Otoh, modern power management involves clock gating --- turning off the clock in specific clock domains that aren't being used at the moment; having fewer clock domains makes that less granular and potentially less effective.

Others' points about individual transistors being smaller for a lower-frequency design also apply. There may be other complementary benefits from lowering the frequency target too.

But note, it's not magic. The Zen4c server parts, where design area has been most fully disclosed, use a lot less space per core and for L1 cache, but L2 and L3 cache take about the same area per byte as on Zen4.

Yeah, I think high frequency requires more transistors to do buffering of signals. Also reducing the cache speeds and size allows simpler, smaller designs to be used. Finally, reduced frequencies mean that you don't need as high a voltage to force signals to go to 0 or 1 quickly so you need less power. All of this gives zen4c lower power consumption at the same frequency.
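The voltage point can be made concrete with the standard dynamic power relation, P = activity x C x V^2 x f. The sketch below compares two hypothetical cores at the same clock; the capacitance, voltage, and activity numbers are invented for illustration and are not Zen4 or Zen4c figures.

```python
def dynamic_power(activity, capacitance_f, voltage_v, freq_hz):
    """Switching power in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


if __name__ == "__main__":
    freq_hz = 3.0e9  # both hypothetical cores run at the same clock

    # Made-up "fast" core: more switched capacitance, higher supply voltage.
    fast = dynamic_power(activity=0.2, capacitance_f=1.0e-9,
                         voltage_v=1.2, freq_hz=freq_hz)

    # Made-up "dense" core: smaller transistors, lower supply voltage.
    dense = dynamic_power(activity=0.2, capacitance_f=0.8e-9,
                          voltage_v=1.0, freq_hz=freq_hz)

    print(f"fast: {fast:.3f} W, dense: {dense:.3f} W, ratio: {dense / fast:.2f}")
```

Because voltage enters squared, even a small drop in supply voltage saves more power than the same percentage drop in switched capacitance.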