by brucehoult 24 days ago
The average bystander doesn't have to care, just buy a machine implementing the RVA23 profile (standard set of extensions) and be happy.

If you're building your own embedded hardware then you determine what your needs actually are, e.g. do you need double precision? half precision? vector? Then you choose a chip implementing that. Then you copy the ISA string from your chip's documentation to the `-march=` argument for GCC/Clang and be happy.
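
To make the "copy the ISA string" step concrete, here is a minimal Python sketch that splits an ISA string into its extensions, checks it against a list of needs, and builds the `-march=` flag. The function names and the example string are hypothetical, not taken from any chip's documentation. It assumes the common layout of base, then single-letter extensions, then underscore-separated multi-letter extensions, and it deliberately ignores version suffixes and does not expand the `g` shorthand.

```python
def parse_isa_string(isa: str) -> dict:
    """Split a RISC-V ISA string into base, single-letter and multi-letter extensions.

    Simplified sketch: no version suffixes (e.g. "2p1"), no expansion of "g".
    """
    isa = isa.strip().lower()
    if not isa.startswith(("rv32", "rv64")):
        raise ValueError(f"unsupported or malformed ISA string: {isa!r}")
    base, rest = isa[:4], isa[4:]
    tokens = rest.split("_")
    single = list(tokens[0])          # e.g. "imafdc" -> ["i", "m", "a", "f", "d", "c"]
    multi = []
    for tok in tokens[1:]:
        if not tok:
            continue                  # tolerate doubled underscores
        if len(tok) == 1:
            single.append(tok)        # single letters may also be underscore-separated
        else:
            multi.append(tok)         # e.g. "zicsr", "zfh"
    return {"base": base, "single": single, "multi": multi}


def missing_extensions(isa: str, needed: set) -> list:
    """Return the needed extensions that the ISA string does not list, sorted."""
    info = parse_isa_string(isa)
    present = set(info["single"]) | set(info["multi"])
    return sorted(ext.lower() for ext in needed if ext.lower() not in present)


def march_flag(isa: str) -> str:
    """Build the compiler flag; the string itself is passed through unchanged."""
    return "-march=" + isa.strip()


if __name__ == "__main__":
    chip_isa = "rv64imafdc_zicsr_zifencei"   # hypothetical example string
    # d = double precision, zfh = half precision, v = vector
    print(missing_extensions(chip_isa, {"d", "zfh", "v"}))
    print(march_flag(chip_isa))
```

The check is only a sanity test against a datasheet string; the compiler remains the authority on which `-march=` values it accepts.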

It's not hard and you don't have to think about it unless you very specifically want to.


The average bystander might want to write high-performance code for their RISC-V CPU. Then they must know precisely which instructions are available and what the performance implications of using them are. E.g., the difference between a shared and non-shared FP register file is huge.
For the "average bystander" they're going to buy an OS and compatible hardware, or if they're the average programmer they're going to use a compiler and libraries that solve the problem already for them. Very very few people need to worry about the details.
The average programmer still must inform their compiler what to use. Claude Code can assist with this.
You need Claude Code to copy a string into your config/make file?
I suspect the PC was being satirical, but yes, this is quite common now.
If you want to get the absolute most out of a specific CPU that is in your hands then you of course have to refer to the documentation for that specific CPU.

That process doesn't depend on whether it's an x86 or an Arm or a RISC-V.

That's why x86 people refer to the HUGE document maintained by Agner Fog.

If you want your code to run well on all standards-compliant implementations then you write according to the ISA documentation, in this case RVA23. Or ARMv9-A. Or x86_64 v3.

Nope. I want to get the most out of all cpus that will run my code. This is a combinatorial problem that grows exponentially by the number of relevant extensions. So, yes, you need to know the hardware, but accounting for combinations of 5 features is way easier than accounting for combinations of 10 features.
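
The arithmetic behind that claim can be sketched in a few lines of Python. This assumes every extension is independently present or absent, which overstates real hardware (extensions have dependencies, and profiles bundle them), and the feature names are placeholders.

```python
from itertools import product


def count_feature_combinations(n_optional: int) -> int:
    """Number of distinct feature sets if each optional feature is on or off."""
    return 2 ** n_optional


def enumerate_combinations(features):
    """Yield every subset of the given feature names as a set."""
    for flags in product((False, True), repeat=len(features)):
        yield {name for name, on in zip(features, flags) if on}


if __name__ == "__main__":
    print(count_feature_combinations(5))    # 2**5
    print(count_feature_combinations(10))   # 2**10
    # Placeholder names, only to show what "a combination" means here.
    for combo in enumerate_combinations(["ext_a", "ext_b", "ext_c"]):
        print(sorted(combo))
```

Under that assumption, 5 features give 32 combinations and 10 features give 1024, so each added extension doubles the number of cases to account for.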

Riscv is basically repeating the same mistake X11 did. A minimal base that could be varied endlessly by combining extensions. It didn't work for X11. Some extensions became de facto mandatory (shm) while others fell by the wayside. But you could never rely on the availability of a given extension because someone somewhere might not have had it or disabled it. Then Wayland came along and cleaned up the gazillion extensions mis-design because it was a huge PITA. Riscv will get there too, sooner or later.

You think the average person writes performance-optimized code?

If you are on that level then you know pretty well what you are targeting. And even then in 99% of cases you just look at the top level profile.

If you do performance analysis for some specific embedded project that is not using a standard profile, then it's a bit more work, but hardly impossible.

Bruh, the "average person" won't buy a riscv-based computer in decades. The average bystander to the riscv project indeed writes high-performance code for their, so far, mostly non-existent or emulated riscv processors.
You're seriously arguing that the average person writes code so performance-critical that minor differences in hardware implementation are relevant? Most people write code that isn't that performance-critical, or firmware, or they are porting existing software over. An extreme minority of the people who interact with an ISA are hand-optimizing code.
Lol... the RISC-V ecosystem has loooong passed that stage. RISC-V is eating into markets from deeply embedded to automotive, high-end server CPUs to specialized accelerators. That's mass-produced hard silicon.

It's here to stay, coming to a device near you Real Soon Now (tm).

Do high-performance RISC-V CPUs (that you can actually buy) still exist? SiFive Unleashed was great but IIRC it was a single batch that never returned.
It depends on what you call "high performance".

I have in my hands one of the new SpacemiT K3 machines. It arrived today. I'm comparing it to several other things, and finding that it is pretty comparable to a "late 2012" Mac Mini with an i7-3720QM (base 2.6 GHz, turbo 3.6 GHz) running Ubuntu 24.04. They are quite close in feel for general use, web browsing, code editing, watching YouTube etc. The Mac is a little faster on many things, a LOT slower on others (anything that can use 8 cores, obviously).

You might say that's not "high performance" but we thought it was pretty good a dozen years ago.

The previous SpacemiT K1 chip two years ago was more like one of the last Pentium IIIs or PowerPC G4s, except with a lot more cores.

SpacemiT have a next generation K5 coming out, they say, at the end of the year. Tenstorrent have their new Ascalon-X core comparable to Apple's late 2020 M1 — and designed by the same guy who designed the M1. They've taped out a chip using that and say they'll be selling a dev board in Q2 or Q3. For now the first version is using an old chip process and it will be running at half the clock speed of the M1, but that's still going to be a very decent machine.

The HiFive Unleashed was of course 8 years ago. Since then there have been the HiFive Unmatched (roughly like Cortex A55) and the HiFive Premier P550 (a bit better than Cortex A72, other than no SIMD).

> You might say that's not "high performance" but we thought it was pretty good a dozen years ago.

Definitely sounds pretty high-performance compared to basically every RISC I've seen (and including nearly every cell phone I've ever owned with the exception of the Apple ones).

Tenstorrent is awesome, can't wait to see if I can afford any of their hardware in 5ish years. I miss when you could buy TPUs as a consumer (Coral etc.)

How does the average bystander buy an RVA23 machine today?
The Arace purchase link for the Jupiter 2 kit says it's "in stock", but it's actually for a discount coupon. The actual system can only be pre-ordered. The Sipeed web site does not say anything about shipping timelines, and the products are not offered in their AliExpress store. I think the Sipeed boards are in preorder, too.

Of course, neither of these are machines. And the average bystander probably isn't used to importing computer parts directly from China, either.

I have a machine in my hot little hands. It arrived today.

https://x.com/BruceHoult/status/2056911834737975635/photo/1

I've already posted on github my first project written on and for it today:

https://github.com/brucehoult/k3_ai

Sipeed have posted photos four days ago of the first batch of customer orders going out:

https://x.com/SipeedIO/status/2055549071931404291

> the average bystander probably isn't used to importing computer parts directly from China, either.

It won't take long for them to be available on Amazon, just as D1 and JH7110 and K1 boards are now, e.g.

https://www.amazon.com/Orange-Pi-RV-Frequency-Development/dp...

Interesting. Is this a full system and not just a board? This is still not quite clear to me.

Hopefully, one of these systems gets produced in such large quantities that there's some pressure to add mainline Linux support.

Deep Computing have started taking orders for the final product, and the preorders are shipping within the next 6 weeks. They will be shipping from China, I expect, but it's a proper shop front.

https://store.deepcomputing.io/products/dc-roma-risc-v-mainb...