Hacker News
by fyi1183 2971 days ago
It's possible I misunderstand you, but I don't think we're talking about the same thing.

With buses, you have different clocks and data moves between them. Like you said: CPU core 1 has its own clock, the bus between the cores has its own, different clock, and then CPU core 2 has its own clock which is different yet again. And in those cases you actually want different clocks, because you want to be able to boost CPUs independently from each other.

What I meant goes in another direction: instead of having a single powerful clock source for e.g. a CPU core, you have multiple smaller clock sources distributed throughout the core, but synchronized to each other so they run at the same frequency and phase. So data can move freely like it does today, but clock signals don't have to be distributed as far, which would hopefully make clock distribution easier and less power hungry.

It seems like such a thing should be possible, but perhaps there are good reasons why it isn't done?

1 comments

Two things:

1. Clocks don't use a lot of power. Think of a pendulum: the energy constantly swings between gravitational potential energy and kinetic energy, so although there's lots of movement, the device uses very little energy. Similarly, a clock circuit (called an oscillator) barely uses any electricity: it mostly "swings" energy back and forth between an inductor and a capacitor.

2. Distributing a clock over a long distance similarly uses very little power (!!) due to transmission line theory. You can use the parasitic capacitance (and inductance) of the wires themselves to get this pendulum effect for efficient long-distance transmission of clocks. See: https://en.wikipedia.org/wiki/Transmission_line

This gif shows an animation of the pendulum effect in a longer transmission line: https://upload.wikimedia.org/wikipedia/commons/8/89/Transmis...
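To make the pendulum analogy concrete, here's a minimal sketch of an ideal (lossless) LC tank, i.e. one inductor and one capacitor with no resistance. The component values and the function name are made up for illustration; a real oscillator has losses that have to be topped up every cycle.

```python
import math

def lc_tank_energy(inductance, capacitance, v0, t):
    """Return (capacitor_energy, inductor_energy) in joules at time t.

    Assumes an ideal LC tank whose capacitor starts charged to v0 volts
    with zero inductor current. No resistance, so nothing is dissipated.
    """
    omega = 1.0 / math.sqrt(inductance * capacitance)
    v = v0 * math.cos(omega * t)
    i = v0 * math.sqrt(capacitance / inductance) * math.sin(omega * t)
    return 0.5 * capacitance * v * v, 0.5 * inductance * i * i

# Illustrative (hypothetical) values: 1 nH and 1 pF, started at 1 V.
L_H, C_F, V0 = 1e-9, 1e-12, 1.0
period = 2 * math.pi * math.sqrt(L_H * C_F)

for k in range(5):
    t = k * period / 8
    e_c, e_l = lc_tank_energy(L_H, C_F, V0, t)
    print(f"t = {k}/8 period: capacitor {e_c:.3e} J, "
          f"inductor {e_l:.3e} J, total {e_c + e_l:.3e} J")
```

The total stays fixed while the split between the two components changes, which is the "swing" described above. Only the losses ignored here cost power.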

----------------

I guess things could be de-sync'd for more efficiency. But your question is kind of like "Well, can't we get rid of V-Tables in C++ to make branch-prediction more efficient??"

I mean, we can. But V-Tables / Polymorphism really doesn't take a lot of time. We only do that if the performance gain really matters.

Interesting, thanks. I'll see if I can grok this from the link you gave.

I do have one follow-up question though: I was under the impression that clock trees contain repeaters in the form of CMOS inverters. Wouldn't those have dynamic (switching) power losses which the transmission line stuff doesn't account for?

I'm not really an expert at the VLSI level. I'm simply thinking from a PCB perspective (and I just know that some of the same issues occur in the smaller chip-level design).

From my understanding: yes, the CMOS inverters will certainly use power. But you can minimize the use of them through some passive techniques.
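For a rough sense of scale, the usual first-order estimate for CMOS switching power is P = alpha * C * Vdd^2 * f. Here's a minimal sketch of that formula. The load capacitance, supply voltage, and frequency are made-up numbers for illustration, not data for any real chip, and the estimate ignores leakage and short-circuit current.

```python
def dynamic_power(alpha, c_load, vdd, freq):
    """First-order CMOS switching power in watts: alpha * C * Vdd^2 * f.

    alpha  -- activity factor (1.0 for a clock net, which toggles every cycle)
    c_load -- switched capacitance in farads
    vdd    -- supply voltage in volts
    freq   -- clock frequency in hertz
    """
    return alpha * c_load * vdd ** 2 * freq

# Hypothetical repeater driving 50 fF at 1.0 V and 3 GHz.
p = dynamic_power(alpha=1.0, c_load=50e-15, vdd=1.0, freq=3e9)
print(f"{p * 1e6:.1f} uW per repeater")
```

Since the cost scales with the number of repeaters and the capacitance each one drives, cutting either with passive techniques reduces the total.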

Looking into the issue more, it does seem like a naive implementation of synchronized clocks can become costly. But at the same time, I'm seeing a number of research papers suggesting that people have been applying transmission-line techniques to the clock distribution problem.

I've always assumed that it was something that was commonly done at the chip level, but apparently not. These papers were published around 2010.