| HN Mirror



	by monocasa 1584 days ago
	In practice, the vast majority of MIPS code uses `addu`, the non-trapping variant. And in x86 land there's the `into` instruction (interrupt if the overflow flag is set), so you're left with the same options.
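To make the two options concrete, here is a minimal Python model of the difference between a trapping and a non-trapping 32-bit add. The names `add32` and `addu32` are hypothetical helpers chosen to echo the MIPS mnemonics; this only sketches the semantics and is not how any CPU or compiler implements them.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def addu32(a: int, b: int) -> int:
    """Wrapping 32-bit add, modeled on MIPS addu: never traps."""
    total = (a + b) & 0xFFFFFFFF  # keep only the low 32 bits
    # Reinterpret the 32-bit pattern as a signed two's-complement value.
    return total - 2**32 if total > INT32_MAX else total


def add32(a: int, b: int) -> int:
    """Trapping 32-bit add, modeled on MIPS add: raises on signed overflow."""
    total = a + b
    if not INT32_MIN <= total <= INT32_MAX:
        # Stand-in for the hardware overflow exception.
        raise OverflowError(f"signed overflow: {a} + {b}")
    return total
```

With these definitions, `addu32(INT32_MAX, 1)` silently wraps around to `INT32_MIN`, while `add32(INT32_MAX, 1)` raises instead of returning a wrapped value.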


spc476 1584 days ago

Which has to be done after every instruction (http://boston.conman.org/2015/09/05.2), but it is quite slow. Using a conditional jump after each instruction is faster than using INTO (http://boston.conman.org/2015/09/07.1).
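A rough Python sketch of why the check has to follow every add: the overflow flag only describes the most recent arithmetic instruction, so a later add overwrites it. The helpers `add_with_flag`, `checked_sum`, and `sum_checking_only_last` are hypothetical names, and inputs are assumed to already fit in a signed 32-bit integer.

```python
def add_with_flag(a: int, b: int) -> tuple[int, bool]:
    """32-bit add returning (wrapped result, overflow flag)."""
    raw = (a + b) & 0xFFFFFFFF
    result = raw - 2**32 if raw >= 2**31 else raw
    # The flag is set when the wrapped result differs from the true sum.
    return result, result != a + b


def checked_sum(values: list[int]) -> int:
    """Test the flag after every add, like a jo or into after each add."""
    acc = 0
    for v in values:
        acc, overflow = add_with_flag(acc, v)
        if overflow:
            raise OverflowError("signed overflow while summing")
    return acc


def sum_checking_only_last(values: list[int]) -> tuple[int, bool]:
    """Keep only the flag from the final add; earlier overflows are lost."""
    acc, overflow = 0, False
    for v in values:
        acc, overflow = add_with_flag(acc, v)
    return acc, overflow
```

For the input `[2**31 - 1, 1, 5]`, the second add overflows, but the third does not, so `sum_checking_only_last` reports a clear flag alongside a wrapped result, while `checked_sum` raises.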


colejohnson66 1584 days ago

My guess would be a pipelining issue where `INTO` isn't treated as a `Jcc`, but as an `INT` (mainly because it is an interrupt). Agner Fog's instruction tables[0] show (for the Pentium 4) `Jcc` takes one uOP with a throughput of 2-4. `INTO`, OTOH, when not taken uses four uOPs with a throughput of 18! Zen 3 is much better with a throughput of 2, but that's still worse than `JO raiseINTO`.

[0]: https://www.agner.org/optimize/instruction_tables.pdf


monocasa 1584 days ago

It's more complicated than what shows up in microbenchmarks like that. Because the check ends up on pretty much every add, you pollute your branch predictor with `jo` instructions everywhere, which can lead to worse overall perf.
