by tentonova2 6007 days ago
> For the same CPU architecture, compiler and FPU control word, you should expect the same results.

That's a stretch. Just because it's called "GCC" doesn't mean it behaves exactly the same on every platform, even if the CPU architecture is the same.

Apple makes extensive modifications, has their own ABI (see below), etc.

> In a possibly related note, however, Apple has bizarre and unnecessarily strict alignment policies for Mac OS on x86:

It's not "bizarre and unnecessarily strict" if you want to be able to rely on SSE2+. Apple had the advantage of being able to define their ABI without regard to most legacy concerns, and so they did.

It's strictly enforced everywhere because Apple's compilers use SSE2+, so they must be able to assume that, at function entry, the stack is properly (16-byte) aligned.

I understand your pain -- I've had to update a JIT implementation to deal with this, along with quite a bit of assembly that assumed 4-byte alignment -- but Apple's reasoning makes sense.

See also: http://stackoverflow.com/questions/612443/why-does-the-mac-a...


The stack is not actually aligned on function entry, because the return address is on top, so more alignment will be needed to avoid SSE2 locals being misaligned. It's not so hard for the callee side of the ABI to make sure the stack is aligned if it's going to use SSE2 and friends; it's rather more onerous to require every call site to make the alignments for the benefit of the callee.

> The stack is not actually aligned on function entry, because the return address is on top, so more alignment will be needed to avoid SSE2 locals being misaligned.

The stack has --known alignment-- on entry, which removes the need to compute alignment at runtime. Any other approach requires more instructions overall.
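
To make the "known alignment" point concrete, here is a rough Python model of the ESP arithmetic, not real compiler output. It assumes a 32-bit x86 ABI in which ESP is a multiple of 16 at the call instruction, so the callee sees ESP % 16 == 12 once the 4-byte return address has been pushed. The function names and the example address are made up for illustration.

```python
ALIGN = 16  # alignment SSE2 aligned loads/stores expect
WORD = 4    # size of a return address or saved register on 32-bit x86


def entry_esp(esp_at_call):
    """ESP as the callee sees it: `call` has pushed a 4-byte return address."""
    return esp_at_call - WORD


def static_frame(esp_entry, locals_size):
    """Known alignment on entry (assumed: esp_entry % 16 == 12).

    After `push ebp`, ESP % 16 == 8, so the compiler can pick a constant
    frame size congruent to 8 mod 16 and emit a single `sub esp, frame`.
    """
    esp = esp_entry - WORD                           # push ebp
    frame = locals_size + (8 - locals_size) % ALIGN  # compile-time constant
    return esp - frame


def dynamic_frame(esp_entry, locals_size):
    """Unknown alignment on entry: the callee masks ESP at run time."""
    esp = esp_entry - WORD                           # push ebp
    esp &= -ALIGN                                    # and esp, -16
    frame = locals_size + (-locals_size) % ALIGN     # round up to 16
    return esp - frame


# Example: ESP is 16-byte aligned at the call site, callee needs 20 bytes.
esp = entry_esp(0xBFFFF000)
print(hex(static_frame(esp, 20)), hex(dynamic_frame(esp, 20)))
```

Both versions end up with a 16-byte-aligned frame; the difference is that the first relies on the caller's guarantee and only subtracts a constant, while the second works from any 4-byte-aligned entry ESP at the cost of the extra masking step.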

> It's not so hard for the callee side of the ABI to make sure the stack is aligned if it's going to use SSE2 and friends; it's rather more onerous to require every call site to make the alignments for the benefit of the callee.

I disagree that it's onerous. It seems silly to increase the runtime costs in exchange for a minutely simplified compiler port. It's not as if non-4-byte aligned ABIs are unusual.

But instead of aligning the stack in one location, the callee, now it needs to be aligned everywhere. It's pretty probable that's more instructions everywhere.

And it's not a "minutely simplified compiler port". That statement is startlingly naive. Do you have any idea how much hand-coded inline assembly, both in the runtime library and in customer code, needs to be carefully reviewed and modified to port from a platform without this requirement to one with it? Particularly since almost every other platform targeting the same architecture doesn't have the requirement?
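
For a sense of what "aligned everywhere" means at a hand-written call site, here is a small sketch in Python. It assumes the caller's ESP is already a multiple of 16 before it sets up arguments, and that the ABI wants ESP to be a multiple of 16 at the call instruction. `call_site_padding` and `esp_at_call` are hypothetical helpers, not anything from a real toolchain.

```python
ALIGN = 16


def call_site_padding(arg_bytes, align=ALIGN):
    """Bytes to subtract from ESP before pushing `arg_bytes` of arguments."""
    return (-arg_bytes) % align


def esp_at_call(esp, arg_bytes):
    """ESP at the `call` instruction after padding and pushing the arguments."""
    return esp - call_site_padding(arg_bytes) - arg_bytes


# Three 4-byte arguments (12 bytes) need 4 bytes of padding first.
print(call_site_padding(12), hex(esp_at_call(0xBFFFF000, 12)))
```

Assembly written for 4-byte alignment would push the three arguments and call directly; under the stricter rule it needs the extra padding step (for example a `sub esp, 4`) at each such call site.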

> But instead of aligning the stack in one location, the callee, now it needs to be aligned everywhere. It's pretty probable that's more instructions everywhere.

SSE2 is used everywhere, so a net increase in instructions is unlikely.

> And it's not a "minutely simplified compiler port". That statement is startlingly naive. Do you have any idea how much hand-coded inline assembly, both in the runtime library and in customer code, needs to be carefully reviewed and modified to port from a platform without this requirement to one with it? Particularly since almost every other platform targeting the same architecture doesn't have the requirement?

Do you have any idea what the advantages are of being able to use SSE2+ everywhere? I find your position to be startlingly naive, especially given that the vast majority of the existing Mac OS X developer base did not have any hand-coded inline assembly targeted at x86-32.

Other than game developers, how many legacy x86-32 developers is Apple genuinely interested in courting? Even for game developers (or JIT authors, or otherwise) with an overabundance of x86 4-byte-alignment-assuming assembly, fixing stack alignment is an annoying issue, not an impossible one.

Ah yes, Apple doesn't want any more developers for its platform. I forgot about that.

No, Apple made a perfectly sane business and technical decision to optimize for their users and existing developer base rather than a small subset of the non-Apple developer base who would have issue with 16-byte stack alignment.

The reasoning makes sense and I'd have done the same. I fixed our code and moved on.