by tzs 1926 days ago
I wonder if order matters? That is, would mul followed by mulh be the same speed as mulh followed by mul?

How about if there is an instruction between them that does not do arithmetic? (What I'm wondering here is whether the processor recognizes the specific two-instruction sequence, or whether it is something more general, like mul internally producing the full 128 bits, returning the lower 64, and caching the upper 64 bits somewhere, so that if a mulh arrives before something overwrites that cache, it can use the cached value.)
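
To make the "cache the upper 64 bits" idea concrete, here is a minimal Python sketch. It assumes mulh means the signed high half of the 128-bit product (as in RISC-V). The `CachingMultiplier` class, its `clobber` method, and the `full_multiplies` counter are hypothetical names made up for illustration. This is a toy model of the guess, not a description of what any real core does.

```python
MASK64 = (1 << 64) - 1

def to_signed64(x):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x

def mul(a, b):
    """Low 64 bits of the product (identical for signed and unsigned operands)."""
    return (a * b) & MASK64

def mulh(a, b):
    """High 64 bits of the signed 128-bit product, returned as a 64-bit pattern."""
    return ((to_signed64(a) * to_signed64(b)) >> 64) & MASK64

class CachingMultiplier:
    """Hypothetical model: mul computes all 128 bits and keeps the upper half."""

    def __init__(self):
        self.cached = None        # (a, b, high) left behind by the last mul
        self.full_multiplies = 0  # how many times the multiplier actually ran

    def mul(self, a, b):
        self.full_multiplies += 1
        full = to_signed64(a) * to_signed64(b)
        self.cached = (a & MASK64, b & MASK64, (full >> 64) & MASK64)
        return full & MASK64

    def mulh(self, a, b):
        key = (a & MASK64, b & MASK64)
        if self.cached is not None and self.cached[:2] == key:
            return self.cached[2]  # reuse the cached upper half, no new multiply
        self.full_multiplies += 1
        return mulh(a, b)

    def clobber(self):
        """Assumed stand-in for whatever would overwrite the cached upper half."""
        self.cached = None
```

In this model, mul followed by mulh on the same operands runs the multiplier once, while mulh followed by mul runs it twice, which is one way order could matter.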

1 comment

It seems like something that would be arbitrary depending on how the optimization was implemented. There wouldn't be an inherent need for that amount of generalization. Apple can tightly control their compiler to follow the rules, and there seemingly wouldn't be any compelling reason not to stick those two instructions back to back in a consistent order, since the second instruction is effectively free.

It would be fun to experiment with, for someone who has the hardware. My guess is that swapping the order will make it slower, but adding an independent instruction or two between them probably won't have a measurable effect. It would also be fun to try to consistently interrupt the CPU between the two instructions somehow, to see if that short-circuits the optimization.
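
As a rough way to lay out that experiment, the Python sketch below lists what two toy hypotheses would predict for each instruction sequence. Everything in it is an assumption made for illustration: the function names, the `"mov"` and `"interrupt"` markers, the idea that an interrupt invalidates a cached upper half, and the simplification that every mul and mulh uses the same operands. It says nothing about what real hardware does; only timing on an actual chip could tell the hypotheses apart.

```python
def fused_pair_only(seq):
    """Hypothesis A: mulh is cheap only when it directly follows mul."""
    return any(x == "mul" and y == "mulh" for x, y in zip(seq, seq[1:]))

def cached_upper_half(seq):
    """Hypothesis B: mul leaves the upper half cached until something clobbers it."""
    cache_valid = False
    for op in seq:
        if op == "mulh" and cache_valid:
            return True
        if op == "mul":
            cache_valid = True
        elif op == "interrupt":  # assumed to invalidate the cached upper half
            cache_valid = False
    return False

# The variations proposed in the thread, all on the same operands.
SEQUENCES = [
    ["mul", "mulh"],               # back to back, expected order
    ["mulh", "mul"],               # swapped order
    ["mul", "mov", "mulh"],        # independent instruction in between
    ["mul", "interrupt", "mulh"],  # CPU interrupted between the two
]

def predictions():
    """Return (sequence, cheap under A, cheap under B) for each variation."""
    return [(seq, fused_pair_only(seq), cached_upper_half(seq)) for seq in SEQUENCES]

if __name__ == "__main__":
    for seq, a, b in predictions():
        print(" ; ".join(seq), "| pair-only:", a, "| cached-half:", b)
```

The only row where the two models disagree is the one with an independent instruction in the middle, so under these assumptions that is the measurement most likely to distinguish them.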