by oopydoopy 2355 days ago
Not necessarily. While the size of pointers and the size of general purpose registers do change between 32 and 64-bit architectures, there can be other significant differences.

For example, on X86_64, the ISA provides additional GPRs that aren't just 64-bit versions of the 32-bit registers. rax is the 64-bit version of eax, for instance, but IIRC you also have r8-r15, which have no counterpart in 32-bit mode. Furthermore, the ABI (at least on Windows) specifies a different function calling convention versus X86.

Additionally, I imagine there are additional instructions/ops on x86_64 over x86. I don't know how else the CPU would distinguish add $1, %eax from add $1, %rax, both of which are legal when the CPU is running in 64-bit mode. (As far as the encoding goes, the 64-bit form is the same opcode with a REX.W prefix byte in front.)

I recall seeing a cool talk on how to confuse or crash many debuggers by doing something clever in assembly. The idea is you would write a block of polyglot 32-bit and 64-bit x86 machine code (i.e. binary that is both a valid x86 and a valid x86_64 instruction sequence), switch the CPU from 32-bit to 64-bit mode at the end of the sequence, then branch back to the start of the block and reinterpret the same instructions as 64-bit rather than 32-bit. You could use this technique to frustrate reverse engineering.


But pretty much all of those differences get handled at the compiler level, right?
If you thought about portability when you were writing the code, then yes, you just flip a compiler switch and you're good to go.

A lot of the older 32bit software never considered the need to switch to 64bit in the future, so there are plenty of implicit assumptions baked in. These are sometimes very obvious (e.g. hand-coded 32bit assembly), but sometimes very hard to detect, and can have a lot of logic built on top of them. For example, consider these two structures:

    struct A { int n; void *p; float f; };
    struct B { int a, b, c; };
Both structures are 12 bytes on 32bit, but A is typically 24 bytes on 64bit: the pointer is naturally aligned to an 8-byte boundary, so there are 4 bytes of padding after n, and another 4 bytes of tail padding after f so that the total size stays a multiple of 8.

There are plenty of ways the code could assume that sizeof(A) == sizeof(B). For example, it could use memcmp() to verify the structures are equal, or it could be allocating them from the same slab allocator to minimize fragmentation. These kinds of bugs will not be caught by the compiler, and they might not be easily noticeable at runtime. It takes significant effort to port a 20-year-old codebase, say, written for 32bit over to 64bit.

Yes, it's easy to say "that's bad code, you shouldn't have written it like that in the first place", but programming 20 years ago was drastically different than today, and those "hacks" could make or break a product back then. And, at the time some 32bit software was written, it was not even obvious what 64bit would look like, so it was not obvious how you'd prepare for it even if you wanted to.