I don't know the answer to that, but keep in mind that Apple is in control of the entire CPU design.
They could for example put an x86 decoder in front of the ARM cores.
After all, modern Intel processors decode x86 to a simpler instruction set used internally anyway.
x86 has total store ordering (TSO), while Arm's memory model is weaker, so an emulator has to add barriers to the translated code to keep the x86 ordering respected on Arm. On newer Arm chips, barriers are much cheaper, which takes care of most of the problem.
Not only that, but Apple could, if they wanted to, strengthen the memory model of their custom chips, meaning the barriers wouldn't be necessary during emulation.
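To make the barrier point concrete, here's a rough sketch of what a translator might emit for x86 loads and stores. Everything in it is an assumption for illustration: `translate_tso_ops` and the `hardware_tso` flag are hypothetical names, the output is Arm-like pseudo-assembly with a fixed register, and the barrier placement is just one conservative mapping, not how any particular emulator does it.

```python
def translate_tso_ops(ops, hardware_tso=False):
    """Toy translator (hypothetical): x86-style memory ops -> Arm-like pseudo-assembly.

    ops is a list of ("load" | "store", address_name) tuples.
    TSO keeps load->load, load->store and store->store order; only
    store->load may be reordered. On a weakly ordered core we add
    barriers to get the same guarantees back.
    """
    out = []
    for kind, addr in ops:
        if kind == "load":
            out.append(f"ldr x0, [{addr}]")
            if not hardware_tso:
                # Keep this load ahead of all later loads and stores.
                out.append("dmb ishld")
        elif kind == "store":
            if not hardware_tso:
                # Keep all earlier stores ahead of this store.
                out.append("dmb ishst")
            out.append(f"str x0, [{addr}]")
        else:
            raise ValueError(f"unknown op: {kind}")
    return out


if __name__ == "__main__":
    program = [("store", "data"), ("store", "flag")]
    # Weakly ordered core: one barrier per memory op.
    print("\n".join(translate_tso_ops(program)))
    # Assumed TSO-capable core: plain loads and stores are enough.
    print("\n".join(translate_tso_ops(program, hardware_tso=True)))
```

With `hardware_tso=True` the barriers disappear entirely, which is the whole appeal of a chip with a stronger memory model.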