by MattyDub
1011 days ago
Can somebody explain why a square root is also considered a flop? Surely that involves more work than the other four operations the article listed. Is there some hardware algorithm for the square root that is as fast as (e.g.) division?

Division and square root are generally slower than the other arithmetic operations, in both latency and throughput. Recent CPUs finally pipeline them partially (one result every two or three cycles), but mainstream designs left them completely unpipelined for many years before that. A decade ago they might take a few tens of cycles; now they’re generally somewhere around ten cycles of latency on “real” CPUs, versus 3-5 cycles of latency for the other floating-point arithmetic instructions.
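
To give a feel for why a square root costs more than a single add or multiply, here is a rough sketch of Newton's iteration in Python. This is only an illustration of the kind of iterative refinement involved, not a claim about how any particular CPU implements its square root instruction; the function name `newton_sqrt`, the tolerance, and the exponent-halving starting guess are all assumptions made for the example.

```python
import math

def newton_sqrt(x, tol=1e-15, max_iter=60):
    """Approximate sqrt(x) with Newton's iteration.

    Returns (root, iterations). Hypothetical helper for illustration only.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if x == 0:
        return 0.0, 0
    # Starting guess: halve the binary exponent of x, so the guess is
    # within a small constant factor of the true root.
    _, exponent = math.frexp(x)
    guess = math.ldexp(1.0, exponent // 2)
    for i in range(1, max_iter + 1):
        # Each refinement step costs one divide, one add and one multiply.
        nxt = 0.5 * (guess + x / guess)
        if abs(nxt - guess) <= tol * nxt:
            return nxt, i
        guess = nxt
    # Not converged within max_iter; return the last estimate.
    return guess, max_iter

root, steps = newton_sqrt(2.0)
print(root, steps)
```

Each pass through the loop already contains a division, so a square root computed this way cannot be cheaper than a divide. Hardware narrows the gap with better starting estimates and dedicated circuitry, which fits the latency figures above.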