Hacker News
by ZephyrBlu 2034 days ago
How do ops/s relate to clock speed, core count and architecture (x nm) then?
2 comments

Let's be even more abstract: what matters is useful work per second.

That metric is a function of clock speed, core count, and lithography process, but also of how many cycles each instruction takes to execute (e.g. how many cycles for an add), the instruction-level parallelism in each core, branch prediction, memory architecture and caching, the instruction set architecture (ISA) and its implementation, available cooling, and more.

For example, by reducing the number of cycles per instruction, increasing instruction-level parallelism, and improving branch prediction, you can get a significant boost in performance, as witnessed by the massive jump[1] from the 486DX2 at 66 MHz to the Pentium 60 (also run at 66 MHz in the video below).

[1]: https://www.youtube.com/watch?v=NLrKxWL73Mw
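
To put rough numbers on the same-clock comparison above, here is a minimal sketch of the textbook relation time = instructions x cycles per instruction / clock rate. The function name and the CPI values are hypothetical, picked only for illustration; they are not measurements of either chip.

```python
def execution_time(instructions, cycles_per_instruction, clock_hz, cores=1):
    """Seconds to finish a workload, assuming it splits perfectly across cores."""
    total_cycles = instructions * cycles_per_instruction
    return total_cycles / (clock_hz * cores)

CLOCK_HZ = 66_000_000  # both hypothetical cores run at the same 66 MHz

# Hypothetical CPI values: the newer core retires an instruction in half the cycles.
older_core = execution_time(1_000_000, cycles_per_instruction=2.0, clock_hz=CLOCK_HZ)
newer_core = execution_time(1_000_000, cycles_per_instruction=1.0, clock_hz=CLOCK_HZ)

speedup = older_core / newer_core
print(f"speedup at the same clock: {speedup:.2f}x")
```

Under these made-up numbers the clock never changes, yet the work finishes in half the time, which is the point of the example.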

The simplest thing you can do to make a processor faster is to have faster memory. Usually this means bigger caches or wider memory buses.
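
A rough way to see why caches and buses matter is the standard average memory access time formula: hit time plus miss rate times miss penalty. This is only a sketch; the function name and all the cycle counts and miss rates below are assumptions for illustration, not figures for any real processor.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average cycles per memory access for a single cache level."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical numbers: a bigger cache lowers the miss rate,
# a wider memory bus lowers the penalty paid on each miss.
small_cache = average_access_time(hit_time=1, miss_rate=0.10, miss_penalty=100)
bigger_cache = average_access_time(hit_time=1, miss_rate=0.02, miss_penalty=100)
wider_bus = average_access_time(hit_time=1, miss_rate=0.10, miss_penalty=50)

for name, cycles in [("small cache", small_cache),
                     ("bigger cache", bigger_cache),
                     ("wider bus", wider_bus)]:
    print(f"{name}: {cycles:.1f} cycles per access on average")
```

Both changes cut the average access time without touching the clock.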

You can decode more instructions at the same time. In theory there is no limit to how many instructions you can decode at once.

Those decoded instructions end up in a buffer, and execution units can process them as long as they have no unresolved data dependencies. You can add as many execution units as you like, provided there is enough independent work to keep them busy.
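
The "if there is enough work" caveat can be sketched with a toy scheduler. It assumes every instruction takes one cycle and that any ready instruction can go to any free unit, which is far simpler than real hardware. `cycles_to_execute` and the two instruction streams are hypothetical names made up for this example.

```python
def cycles_to_execute(deps, units):
    """Cycles needed to run all instructions on `units` execution units.

    deps[i] is the set of instruction indices that instruction i waits on.
    Every instruction takes one cycle (a simplifying assumption).
    """
    done = set()
    pending = list(range(len(deps)))
    cycles = 0
    while pending:
        # An instruction is ready once everything it depends on has completed.
        ready = [i for i in pending if deps[i] <= done]
        if not ready:
            raise ValueError("cyclic dependencies")
        issued = ready[:units]
        done.update(issued)
        pending = [i for i in pending if i not in issued]
        cycles += 1
    return cycles

independent = [set() for _ in range(8)]              # no instruction waits on another
chain = [set()] + [{i - 1} for i in range(1, 8)]     # each waits on the previous one

for units in (1, 4, 8):
    print(units, cycles_to_execute(independent, units), cycles_to_execute(chain, units))
```

In this model the independent stream speeds up as units are added, while the dependency chain takes eight cycles no matter how many units exist.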

You can avoid performance-destroying events like pipeline stalls by predicting the execution flow of instructions. Better branch prediction means fewer stalls and less lost performance.

None of these have anything to do with clock speed.