by aw1621107 876 days ago
Yeah, the extra instructions from the division helped clue me into what might be going on, since the individual (f - b) and (e - b) calculations are visible in Clang's output.

I feel the division by sizeof(T) shouldn't matter that much, since the compiler knows it has pointers to T so I don't think the divisions would have remainders. I want to say pointer overflow and arithmetic on pointers to different objects (allocations?) should also be UB, so I suppose that might clear up most obstacles? I think I'm still missing something...

Does make me wonder how frequently this pattern might pop up elsewhere if it does turn out to be optimizable.
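
To make the sizeof(T) point concrete, here is a minimal C++ sketch of what subtracting two pointers to T amounts to: a byte difference followed by a division by sizeof(T). The function names (element_diff, scaled_byte_diff) are made up for illustration and are not from Clang or from the code being discussed; the sketch also assumes both pointers point into the same array, since anything else is undefined behavior.

```cpp
#include <cstddef>

// Plain pointer subtraction: the language defines the result as a
// number of elements, not a number of bytes.
template <typename T>
std::ptrdiff_t element_diff(const T* hi, const T* lo) {
    return hi - lo;
}

// Roughly what the compiler has to compute under the hood: subtract the
// byte addresses, then divide by sizeof(T). For valid pointers into the
// same array the division never leaves a remainder.
template <typename T>
std::ptrdiff_t scaled_byte_diff(const T* hi, const T* lo) {
    const char* h = reinterpret_cast<const char*>(hi);
    const char* l = reinterpret_cast<const char*>(lo);
    return (h - l) / static_cast<std::ptrdiff_t>(sizeof(T));
}
```

For pointers into one array, both functions should return the same value, which is why the division looks like something the compiler could reason about.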


My 1st paragraph was directly answering your 2nd paragraph here (starting from "This is certainly possible..." to the end). I was saying that compilers can optimize this if they want to, but it takes work to implement, and I can only guess (the reasons I listed) as to why they might not have done so yet.

> Does make me wonder how frequently this pattern might pop up elsewhere if it does turn out to be optimizable.

Probably a fair bit, but as I mentioned, it might break a lot of code too, because there's too much code in the wild doing illegal things with pointers (like shoving random state into the lower bits, etc.). Or not... the Clang folks would probably know better.
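
For anyone who hasn't seen the "state in the lower bits" trick, a rough C++ sketch follows. Node, kTagMask, tag_pointer, untag_pointer and tag_of are hypothetical names, and the sketch assumes a mainstream platform where an 8-byte-aligned address has its low three bits clear when converted to an integer. This version keeps the tagged value in a std::uintptr_t; the worry above is about code that keeps it in a T* and does arithmetic on it, where the byte difference is no longer a multiple of sizeof(T).

```cpp
#include <cstdint>

// Hypothetical node type; alignas(8) guarantees 8-byte alignment.
struct alignas(8) Node {
    int value;
};

// Assumption: the three low address bits are zero and free for a tag.
constexpr std::uintptr_t kTagMask = 0x7;

// Pack a small tag into the low bits of the pointer's integer value.
inline std::uintptr_t tag_pointer(const Node* p, unsigned tag) {
    return reinterpret_cast<std::uintptr_t>(p) | (tag & kTagMask);
}

// Clear the tag bits to recover the original pointer.
inline const Node* untag_pointer(std::uintptr_t bits) {
    return reinterpret_cast<const Node*>(bits & ~kTagMask);
}

// Read back just the tag.
inline unsigned tag_of(std::uintptr_t bits) {
    return static_cast<unsigned>(bits & kTagMask);
}
```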

Right, I suppose my second paragraph was waffling on the extra legwork the compiler would have to do, but I know validating optimizations is just the start of the work.

Maybe this could be a good way to jump into messing with LLVM...

Out of curiosity, how much of a performance difference did you observe in practice when you made this optimization?

I don't recall, unfortunately; it's been a few years.