by userbinator 3341 days ago
They likely decode to the same sequence of uops internally; when I benchmarked them, they ran at basically the same speed as the equivalent sequence of simpler operations, but they're a lot shorter. See this item for more information:

https://news.ycombinator.com/item?id=8477254

However, they were much slower in the days of the P4 (which was really an oddity in x86 performance characteristics).