


	by dtgriscom 472 days ago
	Are there any examples of using this for non-nefarious reasons? For instance, could I add new instructions that made some specific calculation faster?

3 comments

sirdarckcat 471 days ago

You can make a new instruction (or repurpose an existing one) that accesses physical memory directly, bypassing the page walk, which would be faster. You can also make instructions that bypass some checks (like privilege checks) to squeeze out a little more performance. Note that this would introduce security issues, though, so you could only use it with trusted software.
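To give a rough sense of what skipping the page walk saves, here is a toy model of address translation. It assumes x86-64 with 4-level paging and 4 KiB pages; the `tables` dictionary, the table names, and the frame number are all made up for illustration, and nothing here reflects how any microcode patch actually does it.

```python
# Assumption: x86-64, 4-level paging, 4 KiB pages.
# Each level is indexed by 9 bits of the virtual address; the low 12 bits are the page offset.
LEVELS = ("pml4", "pdpt", "pd", "pt")


def split_virtual_address(va):
    """Return the four 9-bit table indices and the 12-bit page offset."""
    offset = va & 0xFFF
    indices = {}
    for depth, name in enumerate(LEVELS):
        shift = 39 - 9 * depth  # 39, 30, 21, 12
        indices[name] = (va >> shift) & 0x1FF
    return indices, offset


def walk(va, tables, root):
    """Translate va through a toy page table.

    Returns (physical address, number of dependent table reads).
    """
    indices, offset = split_virtual_address(va)
    node = root
    reads = 0
    for name in LEVELS:
        node = tables[node][indices[name]]  # one dependent memory read per level
        reads += 1
    return (node << 12) | offset, reads


# Hypothetical mapping: one virtual page mapped to physical frame 0x1234.
va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
tables = {"root": {1: "a"}, "a": {2: "b"}, "b": {3: "c"}, "c": {4: 0x1234}}
pa, reads = walk(va, tables, "root")
print(hex(pa), reads)
```

On a TLB miss, the hardware has to perform those four dependent reads before the actual access can happen. An instruction that takes a physical address directly would skip them, along with the permission checks the page tables carry.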


fc417fc802 471 days ago

Yes. It's been done before for Intel CPUs. https://misc0110.net/files/cpu_woot23.pdf

It's interesting to think about the sorts of things we could do if we had low-level control over our hardware. Unfortunately, things seem to be heading consistently in the opposite direction.


seanw444 471 days ago

Hopefully RISC-V changes things a bit.


colejohnson66 471 days ago

RISC-V wouldn’t help here at all. There’s nothing about RISC-V that prevents a CPU manufacturer putting in custom instructions and not documenting them.


seanw444 471 days ago

I understand that, but my sliver of hope is that since it's an open architecture, there will be manufacturers that make very hackable versions of it.


sweetjuly 471 days ago

It's not especially likely you'll be able to make things faster unless you're really strapped for fetch bandwidth. The reasoning is that most useful uops will already have instructions which directly decode to them (i.e., you're not going to get a faster load with ucode than if you just used a normal load instruction). In fact, given that ucode branches are (typically?) only statically predicted [1], you may well see worse performance if you need to use ucode branches whose directions can't be predicted statically.
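As a rough illustration of the static prediction point, here is a small simulation comparing a fixed-direction predictor against a single 2-bit saturating counter. This is a textbook toy model, not a description of any real microcode sequencer; the branch trace is invented, and real front ends are far more elaborate.

```python
def static_mispredicts(outcomes, predict_taken):
    """Count mispredictions for a predictor that always guesses one direction."""
    return sum(1 for taken in outcomes if taken != predict_taken)


def two_bit_mispredicts(outcomes):
    """Count mispredictions for a single 2-bit saturating counter."""
    counter = 1  # 0-1 predict not taken, 2-3 predict taken; start weakly not taken
    misses = 0
    for taken in outcomes:
        if (counter >= 2) != taken:
            misses += 1
        # Move the counter toward the observed outcome, saturating at 0 and 3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# Hypothetical branch whose direction depends on the data phase:
# taken for the first 500 executions, not taken for the next 500.
outcomes = [True] * 500 + [False] * 500

static_not_taken = static_mispredicts(outcomes, predict_taken=False)
static_taken = static_mispredicts(outcomes, predict_taken=True)
dynamic = two_bit_mispredicts(outcomes)
print(static_not_taken, static_taken, dynamic)  # 500 500 3
```

Whichever direction the static predictor is hard-wired to, it gets half of this trace wrong, while even a very simple dynamic predictor adapts after a couple of misses. A branch with a known bias would not show this gap, which is why the penalty only applies to branches that can't be predicted statically.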

CPU vendors typically gain performance when adding new instructions because they add new fancy uops. For example, x86 has AES instructions which lead to uops which (I imagine) exercise some hardware AES block. Vendors are not simply implementing AES in pure ucode as this wouldn't really gain any performance advantage over doing AES directly in software.

[1] https://arxiv.org/html/2501.12890v1
