emulator would have a zero issue, if it's a direct transfer for assembly (not an emulator), it'd need either hardware support - e.g. apple chips, or memory barriers.
The differences between arm and x86 are known for 15y+, there is nothing new about it. Also concurrency support is one of the major benefits of languages with proper memory model - java started it with JMM[0]
The differences are very known, it's just still an open problem how to address running code for a stronger memory model on systems with a weaker one at as high as possible performance performance without explicit hardware support (like Apple's choice of a TSO config bit). Your compiled binary for TSO has erased any LoadLoad, LoadStore, and StoreStore barriers, and the emulator has to divine them. The heuristics there are still fraught with peril.
The JVM absolutely did some great work walking this path, both in defining a memory model in the first place, and supporting that model on weak and strong hardware memory models, but the JMM was specifically designed to be able to run cleanly on WMO platforms to begin with (early Sparc), so they don't face a lot of the same problems discussed here.
Any emulator that wants to be remotely performance competitive will do dynamic translation (i.e JIT). In fact ahead-of-time translation is not really feasible.
Memory models and JVM are not really relevant when discussing running binaries for a different architecture.
The memory models are relevant as the translation/JIT/whatever has to take them in consideration. TSO is well known and it's also well known how arm weaker memory model needs memory barriers to emulate TSO.
If there is a JIT I'd expect to be able to add a read barriers, on memory location allocated by another thread - incl. allocating bits in the pointers and masking them off on each dereference. If any block appears to be shared - the code that allocated it would need to be recompiled with memory store-store barriers; the reading part would need load-load and so on. There are quite a few ways to deal with the case, aside the obvious - make the hardware compatible.
If in end it's not an easy feat to make up for the stronger memory model correctness, yet correctness should be a prime goal of an 'emulator'
It is relevant as an example of how to write a JIT for a specific memory model that will run on a different architecture with a different memory model. AKA it is a known issue that has been successfully dealt with for quite a while now.
Ah well, back in the day, prior JMM, there was no notion of memory models at all, and no language had one; Hence, I referenced the original paper that started it all. The point was that it happened long time ago, and there is nothing new about the current case.
The differences between arm and x86 are known for 15y+, there is nothing new about it. Also concurrency support is one of the major benefits of languages with proper memory model - java started it with JMM[0]
[0]: https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedL...