by rayiner
4711 days ago
For some reason I thought I read in the original question that he'd replaced the MOV with an equivalent string of NOPs, but now that I read the example again, I clearly just made that up in my head... In that case, I agree that it's probably an instruction alignment issue: specifically, the MOV shifting some group of instructions so they line up better within the 16-byte fetch/decode window. It'd be interesting if someone could run the code on Sandy Bridge or later and see whether the useless MOV still helps. The decoded µop cache should take a lot of the instruction alignment issues off the table.
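To make the alignment idea concrete, here's a rough sketch of how a short extra instruction can change which instructions straddle a 16-byte boundary. The instruction lengths are made up for illustration and are not taken from the original code, and `straddlers` is just a hypothetical helper. This only models the address arithmetic, not the actual front end.

```python
FETCH_WINDOW = 16  # bytes; the fetch/decode window size assumed in the comment


def straddlers(lengths, start=0):
    """Return indices of instructions whose bytes cross a 16-byte boundary."""
    crossing = []
    addr = start
    for i, n in enumerate(lengths):
        first_window = addr // FETCH_WINDOW
        last_window = (addr + n - 1) // FETCH_WINDOW
        if first_window != last_window:
            crossing.append(i)
        addr += n
    return crossing


# Hypothetical instruction lengths in bytes (assumption, for illustration only).
loop_body = [7, 7, 7, 5, 4]
# The same code with a 2-byte register-to-register MOV in front of it.
with_mov = [2] + loop_body

print(straddlers(loop_body))
print(straddlers(with_mov))
```

With these made-up lengths, the third instruction of the original sequence crosses a 16-byte boundary, and the 2-byte shift from the MOV moves it to start exactly on one. Whether anything like this is happening in the real code would have to be checked against the actual disassembly.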