HN Mirror

by bobmcnamara 413 days ago

It's a mix.

Compiler support for RISC-V is better, but every RISC-V core I've seen from them has a much shorter pipeline than the older Xtensa cores, so flash cache misses hit it harder.

Both RISC-V and Xtensa lack an ALU carry bit, a choice made to simplify pipelining. But on these small cores it means 64-bit integer math usually takes a few more cycles than on a Cortex-M Arm chip.
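To make the carry point concrete, here is a rough sketch of what a 64-bit add looks like when the core only has 32-bit registers and no carry flag. The names `u64_pair`, `add64_no_carry_flag` and `add64` are made up for illustration; this is not taken from any particular compiler's output. The carry has to be recovered with an explicit unsigned compare, where a Cortex-M would just use an add followed by an add-with-carry.

```c
#include <stdint.h>

/* A 64-bit value split into two 32-bit words, as a 32-bit core sees it.
 * (Hypothetical type, for illustration only.) */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Add two 64-bit values using only 32-bit operations and no carry flag. */
static u64_pair add64_no_carry_flag(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;              /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);  /* 1 if the low add wrapped around */
    r.hi = a.hi + b.hi + carry;      /* two more adds for the high word */
    return r;
}

/* Convenience wrapper so the result can be compared against native uint64_t. */
static uint64_t add64(uint64_t a, uint64_t b)
{
    u64_pair x = { (uint32_t)a, (uint32_t)(a >> 32) };
    u64_pair y = { (uint32_t)b, (uint32_t)(b >> 32) };
    u64_pair r = add64_no_carry_flag(x, y);
    return ((uint64_t)r.hi << 32) | r.lo;
}
```

On RV32 that is typically four instructions (add, set-less-than-unsigned, add, add) versus two on a core with a carry flag, which is where the "few more cycles" come from.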

2 comments

viraptor 413 days ago

But that also depends on what you use it for. If you're after the wifi and IO and other nice things for a mostly idle device - the pipeline is almost irrelevant. ESPHome can run on older versions just fine too. On the other hand, if you're doing something very optimised and need tight timing around interrupts to drive external hardware - it may matter a lot.

So... depends on the project.

fidotron 413 days ago

The Xtensa variants also come with dual core options, which means you can offload timing sensitive stuff to a dedicated core.

My experiments with the C3 showed that you have to use much larger buffers for things like I2S to make it work without glitching.

bobmcnamara 413 days ago

I also found splitting interrupts between the two cores helps with latency, but even if one core has only a single interrupt, that interrupt latency is increased compared to a single core system with a single interrupt. I suspect this is at least partly because they only put a single fetch pipe between the instruction cache and the crossbar.

bobmcnamara 413 days ago

Absolutely correct.

IshKebab 412 days ago

I think it would be hard to argue that an ALU carry bit was a good idea, even if 64-bit maths takes a few more cycles.

bobmcnamara 412 days ago

There's definitely a trade-off with things that seem relatively simple in the ISA but can really complicate the pipeline.

Xtensa pays for it with crippled 64-bit performance, which has a lot of downstream impacts. E.g., division by a constant is also slower. Most compilers don't even bother fast-pathing 64-bit division by a constant.
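For anyone wondering why division by a constant is tied to 64-bit add performance: the usual fast path replaces the divide with a multiply by a precomputed reciprocal and a shift. For a 64-bit dividend that needs the high half of a 64x64-bit product, which on a 32-bit core is built from four 32-bit multiplies plus several multi-word additions, each of which needs carry propagation. A minimal sketch for unsigned division by 10, with made-up helper names `mulhi64` and `div10_u64` (an illustration of the technique, not what any specific compiler emits):

```c
#include <stdint.h>

/* High 64 bits of the 128-bit product a * b, built from 32-bit pieces.
 * On a 32-bit core every uint64_t addition below turns into a
 * multi-word add that has to propagate a carry. */
static uint64_t mulhi64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum the middle column; this cannot overflow 64 bits. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* n / 10 without a divide: multiply by ceil(2^67 / 10), keep the top bits. */
static uint64_t div10_u64(uint64_t n)
{
    return mulhi64(n, 0xCCCCCCCCCCCCCCCDull) >> 3;
}
```

For example, `div10_u64(12345678901234567890ull)` gives `1234567890123456789`. When every one of those additions costs extra instructions to recover the carry, the multiply-and-shift route loses much of its advantage over a generic divide routine.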

I was surprised to find Apple kept ADC/ADCS in aarch64. Maybe this ends up being one of those things that's less useful or potentially a bottleneck depending on the specific implementation. Edit: backwards compatibility probably.

The fact that a few cores have bolted it on to RISC-V makes me think I must not be alone in missing it.
