by morpheuskafka 1549 days ago
I remember last fall (when I got an M1 MBP for the first time, though the chip itself had already been out for a year and was sold through the school's laptop program), in a CS class, I was the only one who could figure out how to make the JavaFX assignments work on M1. They would all load the GUI okay but crash as soon as anything was clicked.

The error message was from the native JDK code, so it was not very useful. I just gave up and searched the bug tracker for "M1" until I found something that looked close. IIRC it was some weird error caused by code that had been objectively wrong for several years, but the race/error condition had never been observed on an Intel machine. Thankfully the fix was in an EA build of OpenJDK, otherwise I probably would have given up and thrown up a VM in the cloud to run it in.

2 comments

I suspect a huge number of these kinds of bugs exist in production software, and will be brought to light by the "weak"(er) memory model of aarch64.
I'm curious, can you expand on aarch64's 'weaker memory model'?
There is more nuance to it than this, but basically on x86 all memory writes are available to all cores via main memory, whereas on aarch64 they are not.

On x86, a write by core A to memory will be available to core B if core B reads from main memory.

On aarch64, a write by core A will not immediately get published to main memory (it will likely stay in cache: L1, L2, etc.), so even if core B tries to read from main memory it won't see the value from core A.

Ultimately aarch64's "weak"(er) memory model is more efficient as the programmer/compiler can make more efficient memory accesses. This results in fewer cache invalidations between cores. The problem in practice is that tons of production code has been written which assumes the x86 memory model. It may also just be a concurrency bug which doesn't manifest on x86 but does on aarch64 like in the post.

Again, this is a simplification of what happens but I think it illustrates the difference to some degree.
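
For a concrete picture, here is a minimal Java sketch of the usual "write some data, then set a flag" pattern. All the names (MessagePassing, data, ready, runOnce) are made up for illustration, and this is not the code from the JDK bug mentioned above. With the flag declared volatile, the Java memory model guarantees the reader sees 42. If you drop the volatile, the program has a data race: the reader may spin forever or observe the flag without the data, and my assumption is that this kind of failure is more likely to actually show up on aarch64 than on x86.

```java
// Illustrative sketch only; class and field names are hypothetical.
class MessagePassing {
    // Plain (non-volatile) field holding the payload.
    static int data;
    // volatile: a write to this field happens-before any later read that sees it,
    // so the earlier write to `data` is visible to the reader as well.
    static volatile boolean ready;

    static int runOnce() {
        data = 0;
        ready = false;
        int[] seen = new int[1];

        Thread writer = new Thread(() -> {
            data = 42;     // 1. write the payload
            ready = true;  // 2. publish it with a volatile write
        });
        Thread reader = new Thread(() -> {
            while (!ready) {          // volatile read, repeated until the flag is set
                Thread.onSpinWait();  // spin hint (Java 9+)
            }
            seen[0] = data;           // ordered after the volatile read of `ready`
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

The point of the sketch is that the fix lives in the source (volatile, or the equivalent java.util.concurrent tools), not in the hardware. Code that only works because x86 happens to keep stores in order is already wrong per the language spec, it just hasn't been caught yet.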

Here’s the info directly from ARM: https://developer.arm.com/documentation/den0024/a/Memory-Ord...

And here’s a post talking about it in the context of C++11: https://www.arangodb.com/2021/02/cpp-memory-model-migrating-...

That reminds me of an experience targeting x86 Windows and G3 Mac (IBM 750 PowerPC) with some C networking code (a thin client). Immediately I got a “bus error” on the Mac, even though it worked fine on the Penguin III. I found the problem was a misaligned memory access - a blatant mistake on my part - that the Penguin III just covered for somehow. You can read this as an example of the robustness principle, but I recall feeling I’d prefer the CPU just tell me something is wrong and not cover it up.
Wow “penguin III” thanks autocorrect