by woodruffw 1229 days ago
The longest structurally valid x86 instruction is 26 bytes, from some research I did a few years ago[1]. But as others have noted, structurally valid does not mean that any x86 CPU will accept it: such instructions will all produce #UD or similar, including these 16 byte ones.

[1]: https://yossarian.net/res/pub/mishegos-langsec-2021.pdf


You can just keep sticking prefixes on an instruction to get something longer, no? The processor will refuse to decode it but it's "legal" otherwise.
Ostensibly, only one prefix from each group can matter. Although I did notice that XACQUIRE and LOCK are both group 1 prefixes, which makes that statement kind of a lie (but it's an intentional design to make XACQUIRE do nothing on processors that don't support it).

In any case, there's only a finite number of prefixes you can meaningfully stick onto an instruction, and repeating the same prefix will do absolutely nothing.
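
To make the group rule concrete, here is a rough Python sketch that splits off leading legacy prefixes and checks two things: at most one prefix per group, and total length within the 15-byte limit. The group table follows the usual four-group layout from the Intel SDM; the helper names (`split_prefixes`, `check`) are made up for illustration, REX/VEX/EVEX are ignored, and the bytes after the prefixes are not decoded at all.

```python
# Legacy prefix groups (Intel SDM four-group layout). Illustrative only:
# REX/VEX/EVEX prefixes are deliberately left out of this sketch.
PREFIX_GROUPS = {
    0xF0: 1,  # LOCK
    0xF2: 1,  # REPNE/REPNZ (doubles as XACQUIRE)
    0xF3: 1,  # REP/REPE/REPZ (doubles as XRELEASE)
    0x2E: 2, 0x36: 2, 0x3E: 2, 0x26: 2, 0x64: 2, 0x65: 2,  # segment overrides
    0x66: 3,  # operand-size override
    0x67: 4,  # address-size override
}

MAX_INSN_LEN = 15  # architectural limit on total instruction length


def split_prefixes(insn: bytes):
    """Split the leading legacy prefix bytes from the rest of the instruction."""
    i = 0
    while i < len(insn) and insn[i] in PREFIX_GROUPS:
        i += 1
    return insn[:i], insn[i:]


def check(insn: bytes):
    """Return (one_prefix_per_group, within_length_limit) for a byte string."""
    prefixes, _rest = split_prefixes(insn)
    groups = [PREFIX_GROUPS[b] for b in prefixes]
    one_per_group = len(groups) == len(set(groups))
    return one_per_group, len(insn) <= MAX_INSN_LEN


if __name__ == "__main__":
    print(check(bytes([0x66, 0x90])))              # one prefix, 2 bytes
    print(check(bytes([0x66] * 15 + [0x90])))      # repeated prefix, 16 bytes
    print(check(bytes([0xF2, 0xF0, 0x01, 0x08])))  # XACQUIRE + LOCK, both group 1
```

The last case is the XACQUIRE/LOCK wrinkle: two group 1 prefixes on one instruction, which a strict "one per group" check flags even though that combination is intentional.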

Yep: in that sense the longest structurally valid x86 instruction is unbounded in length, so long as you conform to the group restrictions.
What is it? I can't seem to find it in the paper.
There are many structurally valid 26 byte productions; the diagram on page 2 shows the layout and page 5 has pseudocode for generating them.

(There may be even longer valid productions; my analysis was pretty naive. But 26 is already substantially longer than the 15-byte limit!)

Thanks, I saw that, but I was hoping for explicit examples.
Figure 6 has examples (in the context of instruction generation). But note that those instructions are “nonsense,” in the sense that they’re guaranteed to either #UD at 15 bytes or well before then.
Ah, thanks, I missed that! Kind of disappointing that it doesn't give the instructions' disassembly (so to speak -- obviously as you point out in reality they are nonsense), but I suppose I can sit down and figure that out myself. :)