by convery 1228 days ago
> You can add quite a few until you get to 15 bytes. This length is a hard limit on current x86-compatible CPUs. Any instruction longer than 15 bytes is considered invalid and will generate an exception.

There are a few valid 16-byte instructions, though. Sandpile lists some examples: https://www.sandpile.org/x86/opc_enc.htm

  36 67 8F EA 78 10 84 24 disp32 imm32 =  bextr eax,[ss:esp*1+disp32],imm32
  64 67 8F EA F8 10 84 18 disp32 imm32 =  bextr rax,[fs:eax+ebx+disp32],imm32
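
As a quick sanity check on the 16-byte count, here is a minimal Python sketch that tallies the bytes in the two encodings above. It assumes disp32 and imm32 each occupy 4 bytes; `encoded_length`, `FIELD_SIZES` and `MAX_LEN` are names made up for this illustration, not part of any tool.

```python
# Hypothetical helper: count the bytes in an encoding written the way
# the examples above are written (hex bytes plus symbolic fields).
FIELD_SIZES = {"disp32": 4, "imm32": 4}  # assumed: 32-bit fields are 4 bytes
MAX_LEN = 15  # the architectural limit quoted at the top of the thread


def encoded_length(encoding: str) -> int:
    """Return the total length in bytes of a space-separated encoding."""
    total = 0
    for token in encoding.split():
        if token in FIELD_SIZES:
            total += FIELD_SIZES[token]
        else:
            int(token, 16)  # raises ValueError if the token is not hex
            total += 1
    return total


examples = [
    "36 67 8F EA 78 10 84 24 disp32 imm32",
    "64 67 8F EA F8 10 84 18 disp32 imm32",
]

for enc in examples:
    n = encoded_length(enc)
    print(f"{enc}: {n} bytes, over limit: {n > MAX_LEN}")
```

Both encodings come to 16 bytes: two prefixes, a three-byte XOP prefix, the opcode, ModRM and SIB, then 4 + 4 bytes of displacement and immediate.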

The longest structurally valid x86 instruction is 26 bytes, from some research I did a few years ago[1]. But as others have noted, structurally valid does not mean that any x86 CPU will accept them: they’ll all produce #UD or similar, including these 16 byte ones.

[1]: https://yossarian.net/res/pub/mishegos-langsec-2021.pdf

You can just keep sticking prefixes on an instruction to get something longer, no? The processor will refuse to decode it but it's "legal" otherwise.

Ostensibly, only one prefix from each group can matter. Although I did notice that XACQUIRE and LOCK are both group 1 prefixes, which makes that statement kind of a lie (but it's an intentional design to make XACQUIRE do nothing on processors that don't support it).

In any case, there's only a finite number of prefixes you can meaningfully stick onto an instruction, and repeating the same prefix will do absolutely nothing.
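
To make the group idea concrete, here is a rough Python sketch, assuming the usual four-group split of legacy prefixes (group 1: LOCK/REPNE/REP, group 2: segment overrides, group 3: operand-size, group 4: address-size). The function names are hypothetical, REX/VEX/XOP are ignored, and nothing here models what a real decoder does; it only classifies leading bytes.

```python
# Legacy prefix byte -> group number (assumed four-group layout).
PREFIX_GROUPS = {
    0xF0: 1, 0xF2: 1, 0xF3: 1,                             # LOCK, REPNE/XACQUIRE, REP
    0x2E: 2, 0x36: 2, 0x3E: 2, 0x26: 2, 0x64: 2, 0x65: 2,  # segment overrides
    0x66: 3,                                               # operand-size override
    0x67: 4,                                               # address-size override
}


def split_prefixes(code: bytes):
    """Split off the leading legacy prefix bytes; return (prefixes, rest)."""
    i = 0
    while i < len(code) and code[i] in PREFIX_GROUPS:
        i += 1
    return code[:i], code[i:]


def groups_with_conflicts(prefixes: bytes):
    """Return the groups that contain more than one distinct prefix byte."""
    seen = {}
    for b in prefixes:
        seen.setdefault(PREFIX_GROUPS[b], set()).add(b)
    return sorted(g for g, members in seen.items() if len(members) > 1)


# XACQUIRE (F2) followed by LOCK (F0): two different group 1 prefixes.
print(groups_with_conflicts(bytes([0xF2, 0xF0])))

# Fifteen repeated 66 prefixes in front of a one-byte NOP (90): no group
# conflict, but the total is 16 bytes, one past the 15-byte limit.
padded = bytes([0x66]) * 15 + bytes([0x90])
prefixes, rest = split_prefixes(padded)
print(len(prefixes), len(padded), groups_with_conflicts(prefixes))
```

The second case is the padding trick in miniature: the same prefix repeated stays within the group rule yet still pushes the instruction past the length limit.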

Yep: in that sense the longest structurally valid x86 instruction is unbounded in length, so long as you conform to the group restrictions.

What is it? I can't seem to find it in the paper.

There are many structurally valid 26 byte productions; the diagram on page 2 shows the layout and page 5 has pseudocode for generating them.

(There may be even longer valid productions; my analysis was pretty naive. But 26 is already substantially longer than the limit!)

Thanks, I saw that, but I was hoping for explicit examples.

Figure 6 has examples (in the context of instruction generation). But note that those instructions are “nonsense,” in the sense that they’re guaranteed to either #UD at 15 bytes or well before then.

Ah, thanks, I missed that! Kind of disappointing that it doesn't give the instructions' disassembly (so to speak; obviously, as you point out, in reality they are nonsense), but I suppose I can sit down and figure that out myself. :)

Valid meaning "valid according to spec" or "actually executable by processors in practice"?

Going by the linked webpage, the answer appears to be "valid if we ignore the limit, but in fact the limit is enforced and you'll get an exception".

Relying on this limit has even been the cause of a vulnerability before: https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg038...

That's really interesting. I had no idea there were structurally valid instructions that are longer than 15 bytes.