Hacker News
by gary_0 695 days ago
I wonder if instead of having SMT, processors could briefly power off the unused ALUs/FPUs while waiting for something further up the pipeline, and focus on reducing heat and power consumption rather than maximizing utilization.
3 comments

They basically do: it's pretty common to clock-gate inactive parts of the ALU, which greatly reduces their dynamic (switching) power. Modern processor power usage is very workload-dependent for this reason.
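
To put rough numbers on the clock-gating point, here is a minimal sketch using the standard CMOS switching-power relation P = activity * C * V^2 * f. Every value (activity factor, capacitance, voltage, frequency, leakage) and both function names are made-up assumptions for illustration, not measurements of any real ALU.

```python
def dynamic_power_w(activity, capacitance_f, voltage_v, freq_hz):
    """Switching power in watts: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


def unit_power_w(clock_gated, activity=0.2, capacitance_f=1e-9,
                 voltage_v=1.0, freq_hz=4e9, leakage_w=0.05):
    """Total power of one hypothetical execution unit.

    Clock gating stops the clock from toggling the unit, so its switching
    activity is modelled as zero. Leakage is unaffected, because the unit
    is still powered.
    """
    effective_activity = 0.0 if clock_gated else activity
    switching = dynamic_power_w(effective_activity, capacitance_f,
                                voltage_v, freq_hz)
    return switching + leakage_w


active = unit_power_w(clock_gated=False)
gated = unit_power_w(clock_gated=True)
print(f"active: {active:.2f} W, clock-gated: {gated:.2f} W")
```

In this toy model the gated unit still burns its leakage power, which is why fully powering a block off (power gating) is a separate and slower mechanism than clock gating.
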
I consider SMT a relic left over from the days when CPU design was all about performance per square millimeter. We are in the process of replacing that goal with performance per watt, or of slowly realizing that our goals shifted quite a while ago.

I really don't expect SMT to stay around much longer, even more so with timing side-channel crosstalk between threads sharing a core still lurking, and with big/small architectures offering more parallelism per chip area where single-thread performance isn't in the spotlight. Or perhaps the marketing challenge of removing a feature that was once the pride of the company is so big that SMT stays forever.

Intel is removing SMT from their next-gen mobile processor.

My guess is this will help them improve single-thread (ST) performance. We will see how well it works, and whether AMD follows.

Could you, or do they already, put the “extra” LUs right next to the parts of the chip with the highest average thermal dissipation, to even out the thermal load across the chip?

Or stack them vertically, so the least consistently used parts of the chip are farthest away from the heat sink, delaying throttling.

Intel has internal papers that investigated the use of the third dimension and the effect it would have on power consumption and performance. Of course it improves things, but it is very difficult to implement in the real world. The first real use of this technique by Intel is coming soon in the form of backside power delivery.

AMD's 3D V-Cache technology shows that stacking an additional layer of transistors has a significant effect on the thermal limits of a modern CPU. The extra cache is strategically placed over parts of the CPU die that use less power, yet those CPUs still have to run at lower temperatures and power settings than their non-V-Cache counterparts. Just because you can build it doesn't mean that it will be a good fit for the mass market.
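
A back-of-the-envelope way to see why a stacked die ends up with a smaller power budget is the steady-state relation T_die = T_ambient + P * R_th. The sketch below is only that: the temperature limits and thermal resistances are hypothetical values chosen for illustration, not AMD specifications, and `max_power_w` is a made-up helper name.

```python
def max_power_w(t_limit_c, t_ambient_c, r_th_c_per_w):
    """Largest steady-state power that keeps the die at or below t_limit_c.

    Rearranged from T_die = T_ambient + P * R_th.
    """
    return (t_limit_c - t_ambient_c) / r_th_c_per_w


# Hypothetical unstacked die: higher temperature limit, shorter path to the cooler.
plain = max_power_w(t_limit_c=95.0, t_ambient_c=25.0, r_th_c_per_w=0.50)

# Hypothetical stacked die: the extra layer adds thermal resistance, and the
# temperature limit is assumed to be set lower as well.
stacked = max_power_w(t_limit_c=89.0, t_ambient_c=25.0, r_th_c_per_w=0.60)

print(f"plain: {plain:.1f} W, stacked: {stacked:.1f} W")
```

Both a lower temperature limit and a higher thermal resistance shrink the allowable power, so the two effects compound in the stacked case.
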