by userbinator
3341 days ago
They likely decode to the same sequence of uops internally; I benchmarked them and they're basically the same speed as the equivalent sequence of simpler operations, but a lot shorter. See this item for more information: https://news.ycombinator.com/item?id=8477254. However, they were much slower in the days of the P4 (which was really an oddity in x86 performance characteristics).