It was purely software, but later developments of the idea (like NVIDIA Denver) weren't. However, the backend is necessarily tied to the ISA you are emulating (true for all of them, including SoftMachine). You need the data types to match (e.g. Transmeta supported x87-style FP) to have any hope of performance. A universal machine is neither possible nor desirable (it would be inefficient).
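To make the data-type point concrete, here is a toy sketch (Python; every name in it is made up for illustration, not taken from any real translator). It assumes a guest ISA with 32-bit float adds running on a host that only has 64-bit doubles. The faithful version has to round after every guest operation; skipping that is cheaper but gives a different answer. The x87 case is the same problem in the other direction, with an 80-bit guest format on a narrower host.

```python
import struct


def round_to_f32(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def guest_f32_add(a: float, b: float) -> float:
    # Faithful emulation: one extra rounding step per guest add.
    return round_to_f32(a + b)


def run_faithful(start: float, increments: list[float]) -> float:
    # Hypothetical guest program: accumulate in float32, as the guest would.
    acc = round_to_f32(start)
    for inc in increments:
        acc = guest_f32_add(acc, round_to_f32(inc))
    return acc


def run_naive(start: float, increments: list[float]) -> float:
    # Cheap but wrong: compute in host doubles, round only once at the end.
    acc = start
    for inc in increments:
        acc += inc
    return round_to_f32(acc)


if __name__ == "__main__":
    base = 16777216.0  # 2**24, where float32 spacing becomes 2
    print(run_faithful(base, [1.0, 1.0]))  # guest-accurate result
    print(run_naive(base, [1.0, 1.0]))     # differs from the guest
```

With native support for the guest's type, the per-operation fix-up disappears, which is the whole argument for matching the backend to the emulated ISA.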
IMO the most interesting ISA from a JIT point of view these days is WASM, as it offers the most information (context) and places the fewest constraints on the implementation.