by kaslai 1885 days ago
I think I'd want to extend your definition a bit. I do think it's reasonable to target a virtual machine with a so-called assembly language; however, an assembly language must have an unambiguous bidirectional mapping between the target binary code and the assembly language. I'd also explicitly rule out any language that abstracts away things like which registers you're targeting, even if it does so deterministically, as that removes control over which instructions you're actually emitting.

That is not to say that you can't have macros in an assembly language. While you can't unambiguously recover any macros (or label names, for that matter) used when generating assembly language from the binary output, you can still unambiguously recover some sort of assembly which can be used to generate an identical binary output, excepting any junk your particular platform adds that prevents reproducible builds, such as embedded timestamps.


> assembly language must have an unambiguous bidirectional mapping between the target binary code and the assembly language.

This isn’t true for many CPU assembly languages, though. For example, the x86 instruction `add eax, ecx` can be assembled to either `01 c8` or `03 c1`.
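
To make the two encodings concrete, here is a minimal sketch of a decoder that handles only the register-to-register forms of opcodes `01` (ADD r/m32, r32) and `03` (ADD r32, r/m32) in 32-bit mode. `decode_add` and `REGS` are hypothetical names for illustration, not part of any real disassembler.

```python
# 32-bit register names indexed by their 3-bit encoding.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_add(code: bytes) -> str:
    """Decode a two-byte register-to-register ADD (hypothetical helper)."""
    opcode, modrm = code[0], code[1]
    mod = modrm >> 6          # top two bits: addressing mode
    reg = (modrm >> 3) & 7    # middle three bits: "reg" field
    rm = modrm & 7            # low three bits: "r/m" field
    if mod != 0b11:
        raise ValueError("only register-direct operands are handled here")
    if opcode == 0x01:        # ADD r/m32, r32: destination is the r/m field
        dst, src = rm, reg
    elif opcode == 0x03:      # ADD r32, r/m32: destination is the reg field
        dst, src = reg, rm
    else:
        raise ValueError("not one of the two ADD opcodes in this sketch")
    return f"add {REGS[dst]}, {REGS[src]}"

print(decode_add(bytes.fromhex("01c8")))
print(decode_add(bytes.fromhex("03c1")))
```

Both calls print the same text, because the two opcodes differ only in which ModRM field names the destination.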

> you can still unambiguously recover some sort of assembly which can be used to generate an identical binary output

If your binary contains `01 c8`, information is lost when that’s disassembled to `add eax, ecx` and there’s no guarantee that a newly assembled binary won’t contain `03 c1` in its place.
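
The round trip can be sketched as follows. This assumes a toy assembler that always picks the `03` form; which form a real assembler emits is tool-specific, so `assemble` and `disassemble` here are hypothetical stand-ins covering only register-to-register `add`.

```python
# 32-bit register names indexed by their 3-bit encoding.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def disassemble(code: bytes) -> str:
    """Toy disassembler for the two register-to-register ADD encodings."""
    opcode, modrm = code[0], code[1]
    if modrm >> 6 != 0b11 or opcode not in (0x01, 0x03):
        raise ValueError("outside the scope of this sketch")
    reg, rm = (modrm >> 3) & 7, modrm & 7
    # Opcode 01 writes to the r/m field, opcode 03 writes to the reg field.
    dst, src = (rm, reg) if opcode == 0x01 else (reg, rm)
    return f"add {REGS[dst]}, {REGS[src]}"

def assemble(text: str) -> bytes:
    """Toy assembler; ASSUMPTION: it always emits the opcode-03 form."""
    mnemonic, operands = text.split(None, 1)
    if mnemonic != "add":
        raise ValueError("outside the scope of this sketch")
    dst, src = (REGS.index(name.strip()) for name in operands.split(","))
    modrm = 0xC0 | (dst << 3) | src   # mod=11, reg=dst, r/m=src
    return bytes([0x03, modrm])

original = bytes.fromhex("01c8")
text = disassemble(original)
rebuilt = assemble(text)
print(text, original.hex(), rebuilt.hex(), rebuilt == original)
```

The text form carries no record of which opcode produced it, so the rebuilt bytes depend entirely on the assembler's choice.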

Hm, yeah I hadn't considered the possibility of the machine itself having aliased instructions. That's some interesting info, thank you.