Adaptive Clocking in AMD’s Steamroller

	Adaptive Clocking in AMD’s Steamroller (realworldtech.com)
	44 points by synacksynack 4468 days ago

4 comments

sounds 4468 days ago

I wonder how heavily dI/dt events weighed on Intel in deciding to integrate the voltage regulator on-die in Haswell?

Tech Report [1] says the FIVR (Fully Integrated Voltage Regulator) is for:

• higher efficiency, lower voltage ripple

• cleaner power delivery [closer to the load, and controlled by Intel]

• very high switching frequency, up to 140 MHz. This one seems relevant: a faster regulator can react more quickly to dI/dt events

• the only apparent downside is that with more things in one package the thermal load is higher, but bear in mind Haswell's very low overall power usage

[1] http://techreport.com/news/24802/leaked-slides-expose-haswel...

sitkack 4468 days ago

This is just the start. The next phase is to overclock for as long as the transistor error budget stays within bounds, which requires heat and error sensors all over the die.

The other piece is functional instruction-packet transactions that can be retried somewhere else if they fail during processing.

With these changes, CPUs will always be operating in some predetermined error-rate regime. No more overclocking: just change the AER (allowable error rate) register, which will also be a thing for simulations that don't matter, like games, and for ones that do matter, like Excel handling your payroll.

jwise0 4467 days ago

This is not so far from reality. Ben Zorn has done some work on this -- in particular, his work on "Flikker" might be interesting to you. The paper (http://research.microsoft.com/en-us/um/people/moscitho/publi...) was published at ASPLOS'11.

In general, this concept is called "good enough computing"; periodically, people think about it and then let it fall by the wayside. But it is a neat thought experiment, if nothing else!

sitkack 4465 days ago

I have been toying with the design of a floating point processor that has configurable precision for each operation, but I don't yet know enough about the strict needs of numerical computation.

We already do this with SP, DP, EP, bignum, arbitrary precision, and precision-tolerant algorithms, so I am not sure how much of an advantage it would have.

One idea I had would be to decompile a high-performance benchmark, synthesize microbenchmarks for groups of basic blocks to get instruction packet timing for various FP operations, and then model the distribution of speedups from using lower-precision math.
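
The accuracy side of configurable precision can be explored in software before any hardware exists. The following is a rough Python sketch, not anything from a real design: `round_to_bits` and `dot` are hypothetical helpers that round each intermediate result to a chosen number of significand bits, so the numerical error of a kernel can be compared across precisions. It says nothing about speed, and it ignores exponent-range limits.

```python
import math

def round_to_bits(x: float, bits: int) -> float:
    """Round x to `bits` significant binary digits (1 <= bits <= 53).

    Hypothetical helper: models only significand width, not exponent range.
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)      # x == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    # round() uses round-half-to-even, matching the IEEE 754 default mode
    return math.ldexp(round(m * scale) / scale, e)

def dot(xs, ys, bits: int) -> float:
    """Dot product where every multiply and add is rounded to `bits` bits."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = round_to_bits(acc + round_to_bits(x * y, bits), bits)
    return acc

if __name__ == "__main__":
    xs = [0.1 * i for i in range(1, 101)]
    ys = [1.0 / i for i in range(1, 101)]
    exact = dot(xs, ys, 53)   # full double precision as the reference
    for bits in (8, 11, 24, 53):
        print(bits, abs(dot(xs, ys, bits) - exact))
```

Running the script prints the absolute error at each significand width, which is the kind of per-operation tolerance data the idea above would need.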

These papers look interesting:

http://isl.korea.ac.kr/paper/TVLSI_May2004.pdf

http://passat.crhc.illinois.edu/rakeshk/dsn_13_cam.pdf

sanxiyn 4467 days ago

If you enjoy this kind of thing, you will also enjoy Michael Carbin's work. Slides from OOPSLA'13:

http://people.csail.mit.edu/mcarbin/slides/oopsla13.pdf

sitkack 4466 days ago

I have been enjoying both of these immensely.

yoklov 4468 days ago

Sorry if I've misunderstood you, but I can't imagine many/any programs (even games) which function, well, at all if random transistor error is introduced.

oakwhiz 4467 days ago

If the errors are confined to specific operations and specific parts of a computer, such as a floating point unit, then in some cases a certain level of random error can be acceptable.

sitkack 4468 days ago

Bit flips in the least significant bits can be tolerable, and GPU workloads can be tolerant of errors.
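
A minimal Python sketch of why the bit position matters, assuming IEEE 754 single precision (`flip_bit` and `relative_error` are hypothetical helpers written for illustration, not part of any error-injection tool):

```python
import struct

def flip_bit(x: float, bit: int) -> float:
    """Flip one bit (0 = least significant, 31 = sign) of x's binary32 encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return y

def relative_error(x: float, bit: int) -> float:
    """Relative change caused by flipping `bit`, measured against x as a float32."""
    (x32,) = struct.unpack("<f", struct.pack("<f", x))
    return abs(flip_bit(x, bit) - x32) / abs(x32)

if __name__ == "__main__":
    # Bits 0-22 are the fraction, 23-30 the exponent, 31 the sign.
    for bit in (0, 8, 22, 23, 31):
        print(bit, flip_bit(1.0, bit), relative_error(1.0, bit))
```

For 1.0, flipping bit 0 changes the value by 2^-23 (about 1.2e-7), flipping bit 22 gives 1.5, and flipping the sign bit gives -1.0. A fault in the low fraction bits is noise; a fault in the exponent or sign is a wrong answer.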

nkurz 4467 days ago

6. Microarchitectural throttling to reduce current draw (e.g., Itanium processors issue fewer instructions during dI/dt events and vector units often take many cycles to ‘warm up’); this reduces IPC and can cause instruction scheduling challenges.

How literal is the 'warm up' for the vector units? I've occasionally seen this effect mentioned with regard to microbenchmarking, but never understood why this might be. Are vector units actually slowly activated over several cycles so as to reduce voltage droop?

zurn 4467 days ago

It's not literal even in Itanium's case, as it's about leveling the voltage droop and not temperature.

There can be many other kinds of run-up to steady state in processors during microbenchmarks: caches/TLBs, branch predictors, clock gating, macro-scale voltage/frequency scaling, memory prefetching, power management in the system outside the CPU, etc.
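
The usual way to keep these run-up effects out of a measurement is to execute the code under test a number of times untimed before recording anything. A minimal sketch in Python follows; `bench` and its `warmup`/`repeats` defaults are arbitrary illustrative choices, and in an interpreted language the hardware effects listed above will be mixed with interpreter overhead.

```python
import statistics
import time

def bench(fn, warmup=1000, repeats=10000):
    """Call fn() `warmup` times untimed, then return `repeats` timings in ns."""
    for _ in range(warmup):
        fn()                  # let caches, predictors, and clocks settle
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        fn()
        samples.append(time.perf_counter_ns() - start)
    return samples

if __name__ == "__main__":
    cold = bench(lambda: sum(range(100)), warmup=0, repeats=100)
    warm = bench(lambda: sum(range(100)))
    print("first cold call (ns):", cold[0])
    print("warm median (ns):", statistics.median(warm))
```

Reporting the median rather than the mean also reduces the influence of the occasional slow sample caused by interrupts or frequency changes.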

Omniusaspirer 4467 days ago

Not explicitly referring to the article, but it really saddens me that RWT is no longer going to get the extremely detailed articles it used to, with the writer's partial transition to MPR. I'm happy for him, but at the same time I can't justify $800/yr towards good articles as someone who only follows the industry from a hobbyist perspective.

sliverstorm 4467 days ago

You could subscribe to MPR instead? I think that one is $1000/yr, a little more expensive but I understand it has greater coverage.
