by sweetjuly
477 days ago
It's not especially likely that you'll be able to make things faster unless you're really strapped for fetch bandwidth. The reasoning is that most useful uops already have instructions which decode directly to them (i.e. you're not going to get a faster load with ucode than if you just used a normal load instruction). In fact, given that ucode branches are (typically?) only statically predicted [1], you may well see worse performance if you need ucode branches whose directions aren't statically predictable.

CPU vendors typically gain performance when adding new instructions because they add fancy new uops along with them. For example, x86 has AES instructions which decode to uops which (I imagine) exercise some hardware AES block. Vendors are not simply implementing AES in pure ucode, as that wouldn't really gain any performance advantage over doing AES directly in software.

[1] https://arxiv.org/html/2501.12890v1
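
To make the static prediction point a bit more concrete, here's a toy model (not a description of any real microcode sequencer). It compares a predictor that always guesses the same direction against a textbook 2-bit saturating counter. The trace and both function names are made up for illustration; the only assumption carried over from above is that ucode branches get the static treatment.

```python
def static_mispredicts(trace, predict_taken=True):
    """Mispredictions for a predictor that always guesses the same direction."""
    return sum(1 for taken in trace if taken != predict_taken)


def two_bit_mispredicts(trace, state=2):
    """Mispredictions for a 2-bit saturating counter.

    States 0-1 predict not taken, states 2-3 predict taken.
    """
    misses = 0
    for taken in trace:
        if (state >= 2) != taken:
            misses += 1
        # Move the counter toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Hypothetical branch whose direction flips halfway through:
# taken 100 times, then not taken 100 times.
trace = [True] * 100 + [False] * 100

static_misses = static_mispredicts(trace)    # always guesses "taken"
dynamic_misses = two_bit_mispredicts(trace)  # adapts after the flip
```

On this trace the static guess is wrong for the whole second half, while the counter only misses a couple of times around the flip. Real front ends are far more sophisticated than this, but the gap is the reason a data-dependent branch inside ucode can cost more than the same branch in ordinary code.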