by SilasX 3252 days ago
Correct me if I'm horribly misunderstanding [1], but isn't there a more general point here?

A CPU is, at root, a massive Boolean circuit wrapped in a flip-flop and some persisted state. The binary [sequence corresponding to an] opcode is just an input that determines which inputs go where.

Thus, for an n-bit opcode width, there are 2^n valid opcodes. Only m of them will correspond to intelligible, "I might want to use that some day" instructions. (Others, as you note, will be functionally equivalent versions of the m and ignorable as well.)

And so you will have 2^n - m "undocumented opcodes".

[1] based on reading NAND to Tetris


> And so you will have 2^n - m "undocumented opcodes".

Not quite. There will be 2^n - m possible opcodes, but not all of them will have functionality attached. Many may end up being illegal.

So you could have a processor with m=400 and n=16 but no valid opcodes besides the 400. All of the remaining 2^16 - 400 could raise an illegal instruction exception.
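
To put a number on that example, here is a quick sketch of the arithmetic. The variable names are only for illustration; n and m are the values from the example above.

```python
n = 16   # opcode width in bits (from the example above)
m = 400  # documented, valid opcodes (from the example above)

total_encodings = 2 ** n          # every possible bit pattern
unassigned = total_encodings - m  # patterns with no documented meaning

print(total_encodings)  # 65536
print(unassigned)       # 65136
```

On such a processor, each of those 65136 remaining encodings could simply trap.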

So I must have some big misunderstanding then -- in what sense can a different binary input to a Boolean circuit throw an illegal exception? How does the concept make sense at that level?

Processors have "traps" or "exceptions", which work kind of like interrupts. They are utterly unlike exceptions in, say, C++. They transfer control to another location you specify in an interrupt table. The best known of these is the "page fault", which occurs when you read, write, or fetch from memory you're not allowed to access.

Say you have a userland process which dereferences a NULL pointer in C. In the CPU's page table the NULL virtual address is not mapped to a physical address, so a page fault occurs. Control transfers to the OS page fault handler, which then delivers the SIGSEGV signal to the process.

There's also a division by zero exception, and several more. You can read more about them here: http://wiki.osdev.org/Exceptions
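
The "interrupt table" idea can be sketched as a toy model in Python. This is only an illustration, not how any real CPU or OS is implemented: the handler names, the `context` dictionary, and `raise_trap` are all made up. The vector numbers are the x86 ones (0 for divide error, 6 for invalid opcode, 14 for page fault).

```python
# Toy model of a trap/exception vector table (hypothetical names throughout).

def on_divide_error(context):
    return f"divide error at pc={context['pc']:#x}"

def on_invalid_opcode(context):
    return f"invalid opcode at pc={context['pc']:#x}"

def on_page_fault(context):
    # A real OS handler would inspect the faulting address and, for a
    # userland NULL dereference, deliver SIGSEGV to the process.
    return f"page fault at address {context['fault_addr']:#x}"

# Vector number -> handler, loosely analogous to an interrupt table.
VECTOR_TABLE = {
    0: on_divide_error,
    6: on_invalid_opcode,
    14: on_page_fault,
}

def raise_trap(vector, context):
    """Transfer control to whichever handler is registered for this vector."""
    handler = VECTOR_TABLE[vector]
    return handler(context)

result = raise_trap(14, {"pc": 0x401000, "fault_addr": 0x0})
print(result)
```

The point of the model is that the "exception" is just a lookup and a jump: the hardware picks a table entry and continues executing at the address it finds there.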

That much makes sense, but the original framing of the opcode as illegal still comes off as a category error to me (though perhaps that phrasing is common and accepted). The opcode itself isn't illegal, as opcodes are a CPU-level concept, where nothing is illegal.

Rather, the opcode may bring the memory state to something that might violate some OS's security model. But the opcode still does something to the CPU (Boolean) circuit state.

You could think of it as a switch/case statement where the valid opcodes are the cases and there is a default that catches all undefined opcodes and throws an error. It's illegal in the sense that you should never expect to run an instruction with those opcodes.
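
A minimal sketch of that switch/case analogy, assuming a made-up 4-bit opcode space. The three opcodes, the `execute` function, and the `IllegalInstruction` class are invented for illustration and do not correspond to any real instruction set.

```python
class IllegalInstruction(Exception):
    """Stands in for the CPU's invalid-opcode trap."""

def execute(opcode, acc, operand):
    """Decode one instruction and return the new accumulator value."""
    if opcode == 0x0:      # NOP: leave the accumulator alone
        return acc
    elif opcode == 0x1:    # ADD
        return acc + operand
    elif opcode == 0x2:    # SUB
        return acc - operand
    else:                  # the "default" case: every undefined encoding
        raise IllegalInstruction(f"undefined opcode {opcode:#x}")

print(execute(0x1, 5, 3))

try:
    execute(0xF, 5, 3)
except IllegalInstruction as err:
    print("trap:", err)
```

Here 3 of the 16 encodings are defined and the other 13 all fall through to the default branch. The undefined bit pattern still goes through the decoder like any other input; the decoder's response to it is to take the trap instead of updating state.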