by jxors 596 days ago
Not a dumb question at all!

Documentation is definitely not one of x86's strengths. Other architectures do much better. For example, ARM provides formal models of their CPUs, and RISC-V is so simple you could implement all its semantics in a few thousand lines of code.

There are quite a few instructions with undefined behavior, but it is not that much of an issue if you can choose to avoid it -- for example in a compiler. Almost all UB is found in flags or when using invalid instruction prefixes. And although there is some unexpected UB, like `imul`'s zero flag being UB instead of being set according to the result of the multiplication [1], reading the manual and sticking to the parts that are clearly not UB gets you most of the way.
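To make the "stick to the parts that are clearly not UB" approach concrete, here is a rough sketch of how a compiler-style model could track it for the one-operand 32-bit `imul`. The names (`imul32`, `read_flag`, `UNDEFINED`) are made up for illustration. Treating CF and OF as the only defined flags is my reading of the manual, so take that as an assumption rather than a spec.

```python
UNDEFINED = None  # hypothetical marker: no guarantee is assumed for this flag


def to_signed32(x):
    """Interpret the low 32 bits of x as a signed two's-complement value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def imul32(eax, src):
    """Model one-operand 32-bit imul: EDX:EAX = EAX * src (signed)."""
    product = to_signed32(eax) * to_signed32(src)
    lo = product & 0xFFFFFFFF          # new EAX
    hi = (product >> 32) & 0xFFFFFFFF  # new EDX
    # Assumption: CF and OF are set when the product does not fit in EAX alone.
    overflow = to_signed32(lo) != product
    flags = {
        "CF": overflow,
        "OF": overflow,
        # ZF is the surprising one: it is NOT set from the result.
        "ZF": UNDEFINED,
        # Assumed undefined as well, per my reading of the manual.
        "SF": UNDEFINED,
        "AF": UNDEFINED,
        "PF": UNDEFINED,
    }
    return lo, hi, flags


def read_flag(flags, name):
    """Refuse to consume a flag whose value is not guaranteed."""
    value = flags[name]
    if value is UNDEFINED:
        raise ValueError(f"{name} is undefined after this instruction")
    return value
```

A compiler can simply never emit code that hits the `ValueError` path, for example by inserting a `test eax, eax` before branching on ZF. An analysis tool looking at someone else's binary doesn't get that choice, since it has to know what the hardware actually does in that case.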

However, it becomes an issue if you need to analyze a binary that uses UB. Then you can't choose which instructions are used, so you need a complete model of all UB. That's much more difficult, and most decompilers currently fail at it. Figure 1 of our paper shows an example.

[1]: https://explore.liblisa.nl/instruction/F7E8