by _chris_ 878 days ago
> L1i matters, people!

RISC-V consistently wins on L1i footprint.

The complaining is about number of dynamic instructions ("path length"), which can hit you if you don't fuse. Of course, path length might not actually be the bottleneck to raw performance, but it's an easy metric to argue, so a lot of people latch on to it.
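To make "path length" and the effect of fusion concrete, here is a minimal sketch. The trace format, the `FUSIBLE_PAIRS` set, and both function names are assumptions made up for illustration; real cores pick their own fusion pairs and have stricter eligibility rules.

```python
# Hypothetical set of adjacent opcode pairs a core might fuse (an assumption,
# not a list taken from any particular implementation).
FUSIBLE_PAIRS = {("slli", "add"), ("lui", "addi"), ("auipc", "addi")}


def path_length(trace):
    """Path length: the number of dynamic instructions executed."""
    return len(trace)


def fused_path_length(trace):
    """Count macro-ops after greedily fusing adjacent eligible pairs.

    Each trace entry is (opcode, dest_reg, source_regs). In this simplified
    model a pair fuses only if the second instruction overwrites the first's
    destination and also reads it.
    """
    count = 0
    i = 0
    while i < len(trace):
        if i + 1 < len(trace):
            op1, rd1, _ = trace[i]
            op2, rd2, srcs2 = trace[i + 1]
            if (op1, op2) in FUSIBLE_PAIRS and rd1 == rd2 and rd1 in srcs2:
                count += 1  # two instructions retire as one macro-op
                i += 2
                continue
        count += 1
        i += 1
    return count


# Toy trace: an indexed load (shift, add, load) followed by an index increment.
trace = [
    ("slli", "t0", ("a1",)),
    ("add", "t0", ("t0", "a0")),
    ("ld", "t1", ("t0",)),
    ("addi", "a1", ("a1",)),
]

print(path_length(trace), fused_path_length(trace))
```

A study that counts raw dynamic instructions reports the first number; a core that fuses would effectively execute the second.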


>The complaining is about number of dynamic instructions ("path length"), which can hit you if you don't fuse.

Ironically, RISC-V does great there[0]. Note that this is even though the researchers did not consider fusion at all.

0. https://dl.acm.org/doi/pdf/10.1145/3624062.3624233

Dunno about "great" - "For 6 out of 10 mini-app+compiler pairs, Arm has a shorter path length, with the overall average difference when weighting each benchmark equally being 2.3% longer for RISC-V."
While applying the worst possible reading to RISC-V, and despite not considering fusion, it is not worse than ARM.
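The "6 out of 10" count and the equal-weighted average measure different things, which is part of why people read the same result differently. A rough sketch of both follows. The instruction counts are placeholders invented for illustration, NOT the paper's data, and the function names are assumptions.

```python
from statistics import mean

# Placeholder dynamic instruction counts (NOT from the paper):
# benchmark name -> (riscv_count, arm_count)
counts = {
    "bench_a": (105, 100),
    "bench_b": (98, 100),
    "bench_c": (110, 100),
    "bench_d": (99, 100),
}


def equal_weight_ratio(counts):
    """Mean of per-benchmark RISC-V/Arm ratios, each benchmark weighted equally."""
    return mean(rv / arm for rv, arm in counts.values())


def arm_wins(counts):
    """Number of benchmarks where Arm has the shorter path length."""
    return sum(1 for rv, arm in counts.values() if arm < rv)


print(f"Arm shorter on {arm_wins(counts)} of {len(counts)}")
print(f"equal-weighted RISC-V/Arm ratio: {equal_weight_ratio(counts):.3f}")
```

With these made-up numbers the win count is an even split while the average still leans a few percent one way, so neither figure alone settles "better" or "worse".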

That's awesome.

Isn't shorter path length the goal here? And ARM is better by both those metrics. Am I misunderstanding something?

ARM would of course benefit from fusion too, but camel-cdr's mention of it being only rv64g is a pretty significant caveat.

Yes, shorter path is the goal.

No, winning 4 and losing 6 by a small margin isn't "being worse than ARM". The paper's authors even explicitly conclude it is not losing to ARM.

This even ignores whether code is inside or outside loops, counts fusible instructions as always non-fused, and does not consider any instructions from extensions ratified after 2019's rv64g (itself unchanged since 2017)... any of those would have a favorable effect on RISC-V.

This is an excellent result for RISC-V, one that clears up any doubts about path length, on top of what we already know about RISC-V leading in 64-bit code density.

Might not be "worse" (I'd definitely agree that the difference is plenty small enough to be considered equal within error bounds), but is certainly not something worthy of RISC-V being noted as doing "great" either.

Excluding extensions is perhaps a significant question, but, for example, Debian RISC-V currently targets rv64gc, which should have the same instruction counts as rv64g does, so software compiled for Debian can't use the later extensions for most code anyway. (never mind that ARMv8 also has excluded extensions, namely NEON, which is always present on ARMv8 and is not designed to be ignored)

And, of course, even being better than ARM is not equivalent to being the best it could be; ARMv8 isn't some attempt at a magical optimal instruction set, it's designed for whatever ARM needed, and that includes being able to efficiently share hardware with ARMv7 for backwards compatibility.

it's also targeting just rv64g
Right. Bitmanip would also, on its own, reduce instruction count considerably.
Also the difference in number of instructions on real programs is in the 10% range, which could well be compensated by other factors. For example, keeping to simpler instructions might well result in a 10% higher clock speed and lower silicon area too, equalising matters if not gaining an advantage.
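The trade-off in that last point is just the classic performance equation, time = instructions × CPI / clock. A back-of-the-envelope sketch, with every number below being an assumed round figure rather than a measurement:

```python
import math


def exec_time(instructions, cpi, clock_hz):
    """Execution time in seconds: instruction count * cycles per instruction / clock."""
    return instructions * cpi / clock_hz


# Assumed baseline: 1M instructions, CPI of 1, 1 GHz clock.
baseline = exec_time(1_000_000, 1.0, 1.0e9)

# Assumed alternative: 10% more instructions, same CPI, 10% faster clock.
simpler_isa = exec_time(1_100_000, 1.0, 1.1e9)

print(baseline, simpler_isa)
print("equal within rounding:", math.isclose(baseline, simpler_isa))
```

Under these assumptions the two cancel out exactly; whether simpler instructions really buy a 10% clock gain, and whether CPI stays the same, is the part that has to be measured.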