by rwmj 765 days ago
I don't know, but we had a similar bug in OCaml, although in reverse.

Linux/x86-64 expects the stack to always be 16-byte aligned (although the ABI documentation at the time didn't make this assumption very clear). OCaml called into C with a non-aligned stack. GCC-generated code, assuming the stack was 16-byte aligned, used some strange Intel AVX instruction that only works on aligned data, unlike every other Intel instruction ever that can work on any alignment (albeit maybe more slowly).

This manifested itself as rare and totally unreproducible crashes (because stack alignment differed between runs). It was a bit of a nightmare to solve.

2 comments

The fact that MSVC generates unaligned loads for every AVX instruction but GCC didn't gave me so many headaches. Most people worked on PC or Xbox and I was on the PlayStation team. "oh boy, another one of these..."
Yes! It's one of those cases where when you've seen it before and know the catch with the instruction (probably vmovdqa) then you'll immediately recognise it. If you don't know it, it's very very mysterious. Why on earth Intel decided to make a handful of instructions require alignment is also a mystery to me.
Sweet mama speed. Although from what I understand it is more legacy speed, because you're losing all your time to fetching the memory anyway. But when processors were slower it was a meaningful amount.
The instruction wants to access one cache line, not two.
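To make the cache-line point concrete, here is a small arithmetic sketch. It assumes a 64-byte cache line (typical on current x86-64 parts, but an assumption here) and a 16-byte access; `crosses_cache_line` is a hypothetical helper name, and this is only a model of the address math, not of how any particular CPU behaves.

```python
CACHE_LINE = 64  # assumed cache-line size in bytes


def crosses_cache_line(addr: int, size: int = 16, line: int = CACHE_LINE) -> bool:
    """Return True if the access [addr, addr + size) touches two cache lines."""
    first_line = addr // line
    last_line = (addr + size - 1) // line
    return first_line != last_line


# A 16-byte access at a 16-byte-aligned address never straddles a boundary,
# because 16 divides 64 evenly.
assert not any(crosses_cache_line(a) for a in range(0, 4096, 16))

# An unaligned access can: starting 8 bytes before a boundary touches two lines.
assert crosses_cache_line(56)
```

So requiring alignment is one way for an instruction to guarantee it only ever touches a single line.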
It makes more sense now that Intel and AMD retconned naturally-aligned 128-bit atomic loads into the ISA: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 (AMD's confirmation is in comment 10.)
I hear these stories of black magic bugs and I look back at my 8-year career wondering if I'm even progressing as an engineer. Thrown from one studio to the next and never truly staying long enough to be trusted to investigate such issues.

I have no idea how engineers that started in the 10's or 20's are going to rise up to take over from those from the 90's/00's. So much is abstracted, but games specifically need to understand what's under the hood. Because they can and will hit some of the nastiest edge cases.

AFAIK you should only need 16-byte stack alignment if you use vector instructions, so Linux/x86-64 doesn't mandate it in all cases
The problem is that if you call another function, you won't know whether that function uses any instructions that require alignment. So in practice, only leaf functions can skip stack alignment. The ABI states that the stack pointer must be a multiple of 16 immediately before any `call` instruction, so at function entry (after `call` has pushed the 8-byte return address) it is a multiple of 16, plus 8.
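A toy model of that arithmetic, for anyone who wants to see why a misaligned caller only blows up inside the callee. It assumes the System V x86-64 convention that `rsp` is a multiple of 16 right before `call`, and that `call` pushes an 8-byte return address. The addresses and the 24-byte frame size are made-up illustrative values, and the function names are hypothetical.

```python
RETURN_ADDRESS_SIZE = 8  # bytes pushed by `call` on x86-64


def rsp_after_call(rsp_before_call: int) -> int:
    """Model `call`: the stack grows down by one 8-byte return address."""
    return rsp_before_call - RETURN_ADDRESS_SIZE


def is_aligned(addr: int, alignment: int = 16) -> bool:
    return addr % alignment == 0


def aligned_slot_ok(rsp_at_entry: int, frame_size: int) -> bool:
    """Is the bottom of the callee's frame 16-byte aligned?

    The compiler picks frame_size assuming rsp_at_entry % 16 == 8, so an
    aligned-only store to this slot is only legal if the caller kept its promise.
    """
    return is_aligned(rsp_at_entry - frame_size)


FRAME_SIZE = 24  # illustrative: 8 bytes of padding + one 16-byte vector slot

# Well-behaved caller: rsp is a multiple of 16 before `call`.
entry = rsp_after_call(0x7FFD00001000)
assert entry % 16 == 8
assert aligned_slot_ok(entry, FRAME_SIZE)

# Misaligned caller: rsp is off by 8 before `call`.
entry = rsp_after_call(0x7FFD00001008)
assert entry % 16 == 0
assert not aligned_slot_ok(entry, FRAME_SIZE)  # an aligned-only store would fault here
```

Since the real stack pointer's low bits can differ from run to run, the same binary can land in either case, which fits the "rare and unreproducible" symptom described above.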