by FullyFunctional
781 days ago
The expansion of a 16-bit C insn to 32-bit isn't the problem. That part is trivial. The problem (and it is significant) is for a highly speculative superscalar machine that fetches 16+ instructions at a time but cannot tell the boundaries of instructions until they are all decoded. Sure, it can be done, but that doesn't mean that it doesn't cost you in mispredict penalties (AKA IPC) and design/verification complexity that could have gone to performance. It is also true that burning up the encoding space for C means pain elsewhere. Example: branch and jump offsets are painfully small. So small that all non-toy code needs to use a two-instruction sequence to call (and sometimes more). These problems don't show up on embedded processors and workloads. They matter for high performance.
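To put numbers on "painfully small", here is a rough sketch of the reach of the base RISC-V encodings: a conditional branch carries a 13-bit signed byte offset (about ±4 KiB), `jal` carries a 21-bit signed byte offset (about ±1 MiB), and anything farther needs an `auipc` + `jalr` pair. The function names are made up for illustration, and the sketch ignores linker relaxation, which can shrink the pair back to a single `jal` when the target turns out to be close.

```python
# Hypothetical helpers; the immediate widths follow the base RV32I/RV64I encodings.

def fits_branch(offset: int) -> bool:
    """B-type: 13-bit signed byte offset, always a multiple of 2."""
    return offset % 2 == 0 and -(1 << 12) <= offset < (1 << 12)

def fits_jal(offset: int) -> bool:
    """J-type: 21-bit signed byte offset, always a multiple of 2."""
    return offset % 2 == 0 and -(1 << 20) <= offset < (1 << 20)

def split_hi_lo(offset: int) -> tuple[int, int]:
    """Split a far offset into the auipc upper-20 part and the jalr lower-12 part.

    jalr sign-extends its 12-bit immediate, so the upper part is rounded
    (hence the + 0x800) and the lower part may come out negative.
    """
    hi = (offset + 0x800) >> 12
    lo = offset - (hi << 12)
    return hi, lo

def call_sequence(offset: int) -> list[str]:
    """Pick the shortest call sequence that reaches `offset` from the call site."""
    if fits_jal(offset):
        return [f"jal ra, {offset}"]
    hi, lo = split_hi_lo(offset)
    return [f"auipc ra, {hi}", f"jalr ra, {lo}(ra)"]
```

With this sketch, `call_sequence(4096)` comes back as one instruction, while a call into a library mapped a few MiB away comes back as two.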
Not fully decoded though, since it's enough to look at the lower bits to determine instruction size.
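As a rough sketch of what "look at the lower bits" means: the length falls out of the low bits of the first 16-bit parcel. The 16-bit and 32-bit cases are the ratified ones; the 48-bit and 64-bit patterns below follow the longer-length scheme sketched in the ISA manual, which I'm assuming here and which isn't frozen. The function names are mine.

```python
def insn_length(parcel: int) -> int:
    """Return the instruction length in bytes from its first 16-bit parcel."""
    if (parcel & 0b11) != 0b11:
        return 2                      # compressed (C) instruction
    if (parcel & 0b11100) != 0b11100:
        return 4                      # standard 32-bit instruction
    if (parcel & 0b111111) == 0b011111:
        return 6                      # 48-bit (assumed, not frozen)
    if (parcel & 0b1111111) == 0b0111111:
        return 8                      # 64-bit (assumed, not frozen)
    raise ValueError("longer or reserved encoding")

def boundaries(parcels: list[int]) -> list[int]:
    """Return the parcel indices at which instructions start.

    Written as a sequential walk: each start depends on the length of the
    instruction before it. Hardware would instead compute a length for every
    parcel in parallel and then select the valid starts.
    """
    starts = []
    i = 0
    while i < len(parcels):
        starts.append(i)
        i += insn_length(parcels[i]) // 2
    return starts
```

For example, `boundaries([0x0001, 0x0013, 0x0000, 0x0001])` is a `c.nop`, a 32-bit `nop` split across two parcels, then another `c.nop`, so the starts are parcels 0, 1 and 3.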
> Sure, it can be done, but that doesn't mean that it doesn't cost you in mispredict penalties
What does decoding have to do with mispredict penalties?
> Example: branch and jump offsets are painfully small
Yes, that's what the 48-bit instruction encoding is for. See e.g. what the scalar efficiency SIG is currently working on: https://docs.google.com/spreadsheets/u/0/d/1dQYU7QQ-SnIoXp9v...