by pclmulqdq 22 days ago
It's way worse on RISC-V. There are maybe 5 x86 or ARM variants to care about at any given time, even if you want to hyper-optimize your code. RISC-V has a soup of literally 100s of extensions with non-uniform use and support.

There are a lot more ARM extensions than people are aware of. E.g. Debian uses ARMv8-A with FEAT_FP and FEAT_AdvSIMD as a base. Yes, floating-point and SIMD are optional in ARMv8-A, as are the following ISA extensions, only including ones that add instructions and excluding the AArch32 stuff: FEAT_CRC32, FEAT_AES, FEAT_PMULL, FEAT_SHA1, FEAT_SHA256, FEAT_RDM, FEAT_F32MM, FEAT_F64MM, FEAT_I8MM, FEAT_LSMAOC, FEAT_SHA3, FEAT_SHA512, FEAT_SM3, FEAT_SM4, FEAT_SVE, FEAT_EPAC, FEAT_FCMA, FEAT_JSCVT, FEAT_LRCPC, FEAT_DotProd, FEAT_FHM, FEAT_FlagM, FEAT_LRCPC2, FEAT_BTI, FEAT_FRINTTS, FEAT_FlagM2, FEAT_MTE, FEAT_MTE2, FEAT_RNG, FEAT_SB, FEAT_BF16, FEAT_DGH, FEAT_EBF16, FEAT_CSSC, ...

Also fun: FEAT_LittleEnd, FEAT_MixedEnd, FEAT_BigEnd

All of that was just 64-bit ARMv8.x-A; there is a lot more once you go to the R or M profiles, 32-bit, and previous versions.

The reason this is mostly not a problem is that distros converged on a minimum of 64-bit ARMv8-A + FP + SIMD, which will also happen with RVA23 for RISC-V.

Just for fun, here are the Zen4 ISA flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves avx512_bf16 clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload avx512vbmi umip avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm

Compared to RVA23 written out: rv64imafdcbv_zicsr_zicntr_zihpm_ziccif_ziccrse_ziccamoa_zicclsm_zic64b_za64rs_zihintpause_zba_zbb_zbs_zicbom_zicbop_zicboz_zfhmin_zkt_zvfhmin_zvbb_zvkt_zihintntl_zicond_zimop_zcmop_zcb_zfa_zawrs_svbare_svade_ssccptr_sstvecd_sstvala_sscounterenw_svpbmt_svinval_svnapot_sstc_sscofpmf_ssnpm_ssu64xl_sha_supm_zifencei
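For anyone who wants to poke at a string like that, here is a minimal Python sketch that splits it into XLEN, single-letter extensions, and multi-letter extensions. It assumes the simplified form used above: lowercase, an underscore before every multi-letter extension, and no version suffixes. The full RISC-V naming rules allow more than this, and `parse_isa_string` is just a name made up for illustration.

```python
def parse_isa_string(isa):
    """Split e.g. 'rv64imafdc_zicsr' into (xlen, single_letter, multi_letter).

    Assumes the simplified form: 'rv' + XLEN + single letters, then
    underscore-separated multi-letter extensions, no version numbers.
    """
    s = isa.strip().lower()
    if not s.startswith("rv"):
        raise ValueError("ISA string must start with 'rv'")
    rest = s[2:]
    digits = ""
    while rest and rest[0].isdigit():
        digits += rest[0]
        rest = rest[1:]
    if not digits:
        raise ValueError("missing XLEN (e.g. 32 or 64)")
    # The first chunk holds the single-letter extensions, one per character.
    head, *tail = rest.split("_")
    return int(digits), list(head), [ext for ext in tail if ext]


xlen, single, multi = parse_isa_string("rv64imafdcbv_zicsr_zicntr_zihpm")
print(xlen, "".join(single), multi)
```

Feeding it the whole RVA23 string gives you the extension list as something you can count or diff against another chip.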

I will note that you listed out all of the RVA23 instruction extensions, not all of the blessed RISC-V instruction set extensions. Here's the list of every ratified RISC-V instruction set extension, to get parity with the list you gave for the other ISAs:

M, A, F, D, Q, C, B, H, Zicsr, Zifencei, Zicntr, Zihpm, Zihintpause, Zihintntl, Zicbom, Zicbop, Zicboz, Zicond, Zicfilp, Zicfiss, Zimop, Zca, Zcb, Zcd, Zce, Zcf, Zcmp, Zcmt, Zcmop, Zclsd, Zilsd, Zmmul, Zfh, Zfhmin, Zfa, Zfbfmin, Zfinx, Zdinx, Zhinx, Zhinxmin, Zaamo, Zalrsc, Zawrs, Zacas, Zabha, Zalasr, Zba, Zbb, Zbc, Zbs, Ztso, Zbkb, Zbkc, Zbkx, Zknd, Zkne, Zknh, Zksed, Zksh, Zkn, Zks, Zkt, Zk, Zkr, Zve32x, Zve32f, Zve64x, Zve64f, Zve64d, Zve, Zvl32b, Zvl64b, Zvl128b, Zvl256b, Zvl512b, Zvl1024b, Zvl, Zv, Zvfh, Zvfhmin, Zvfbfmin, Zvfbfwma, Zvbb, Zvbc, Zvkb, Zvkg, Zvkn, Zvknc, Zvkned, Zvkng, Zvknha, Zvknhb, Zvks, Zvksc, Zvksed, Zvksg, Zvksh, Zvkt, Sm1p11, Sm1p12, Sm1p13, Smaia, Smepmp, Smstateen, Smcdeleg, Smcsrind, Smcntrpmf, Smrnmi, Smdbltrp, Smmpm, Smnpm, Smctr, Ss1p11, Ss1p12, Ss1p13, Ssaia, Ssccfg, Sscsrind, Sscofpmf, Sstc, Ssqosid, Ssdbltrp, Ssnpm, Sspm, Ssctr, Supm, Sv32, Sv39, Sv48, Sv57, Svinval, Svnapot, Svpbmt, Svadu, Svvptc, Svrsw60t59b, Sdext, Sdtrig

That doesn't look very short to me.

These are grouped into profiles, like "Skylake" or "Cortex-M33" or "Neoverse-N1." The main issue for RISC-V isn't the number of instruction set extensions, it's the number of profiles. RVA23 is one single blessed profile, but many chips will add a few more instructions or include fewer than RVA23 based on age of the chip.

Common Linux distros will target one of the profiles, or a commonly supported subset like RV64GC.
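Targeting a baseline boils down to a subset check: a chip can run the distro if it implements everything in the baseline, whatever else it adds on top. A rough Python sketch, with G expanded to IMAFD plus Zicsr and Zifencei; the two example chips are hypothetical, not real parts.

```python
# RV64GC written out as individual extensions (G = IMAFD + Zicsr + Zifencei).
RV64GC = {"i", "m", "a", "f", "d", "c", "zicsr", "zifencei"}


def missing_from_baseline(chip_extensions, baseline=RV64GC):
    """Return the baseline extensions the chip lacks, sorted by name."""
    have = {ext.lower() for ext in chip_extensions}
    return sorted(set(baseline) - have)


def can_run(chip_extensions, baseline=RV64GC):
    """True if the chip implements every extension in the baseline."""
    return not missing_from_baseline(chip_extensions, baseline)


# Hypothetical application-class chip: baseline plus vector and bitmanip.
app_chip = RV64GC | {"v", "zba", "zbb", "zbs"}
# Hypothetical microcontroller: integer, multiply, atomics, compressed only.
mcu_chip = {"i", "m", "a", "c", "zicsr"}

print(can_run(app_chip), missing_from_baseline(app_chip))
print(can_run(mcu_chip), missing_from_baseline(mcu_chip))
```

Extra extensions never hurt the check; only missing ones do, which is why the baseline matters more than the total count.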

Beyond that, which other extensions a particular board or chip supports doesn't affect regular uses like web browsing. Specific apps or software libraries may use an ISA extension if present. Same as for other ISAs.
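"Use an extension if present" usually means checking once at startup and picking a code path. A minimal Python sketch of the idea, assuming Linux-style `/proc/cpuinfo` text where x86 reports a `flags` line and arm64 a `Features` line. The kernel names returned by `choose_kernel` are hypothetical, and real libraries do this check in native code rather than by parsing text.

```python
def cpu_features(cpuinfo_text):
    """Return the feature tokens from the first 'flags' or 'Features' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("flags", "Features"):
            return set(value.split())
    return set()


def choose_kernel(features):
    """Pick the best available implementation; names are illustrative only."""
    if "avx2" in features:
        return "avx2"
    if "asimd" in features:
        return "neon"
    return "generic"  # portable fallback that only needs the baseline


# Sample text standing in for the contents of /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 bmi2\n"
print(choose_kernel(cpu_features(sample)))
```

The fallback path is the important part: the binary still runs on the baseline, and the extension only buys speed.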

Code for embedded systems is optimized for the exact CPU in there. Same thing for highly specialized jobs (scientific / datacenter type stuff).

In short: yes, fragmentation wrt ISA extensions, hardware & software support exists. In practice, it isn't as big a problem as some claim.

That sure is a long list. But written out like that it gets a bit misleading: does there exist anything with that same list, just missing pae? mmx? syscall? Just because they have individual names & flags, doesn't mean every combination of them exists.
The Intel manuals list the set of features that are removed or planned to be removed from newer hardware versions: Sub-page write permissions for EPT, xAPIC mode, Key Locker, Uncore PMI. IA32_DEBUGCTL MSR, bit 13 (MSR address 1D9H), Intel® Memory Protection Extensions (Intel® MPX), MSR_TEST_CTRL, bit 31 (MSR address 33H), Hardware Lock Elision (HLE), VP2INTERSECT. AMD's manuals suggest that they view the ISA as purely additive, but I haven't read them in detail.

Basically, outside of MPX, and the confusing lineage of AVX-512 on client versus server parts, x86 is pretty strictly additive.

What are you imagining? If this is desktop then most of the extensions are going to be standard.

The only reason they're optional is because I'm using the same instruction set on my Pico, so no it doesn't have floating point, and I believe it has integer divide but I wouldn't be surprised if it didn't.

And the extensions are in groups, a good chunk of which are compressed instructions, which unless you're writing assembly, you don't need to worry about.

In fact, most of this you don't need to worry about unless you're writing assembly.

Electronics distributors' search engines tend to work extremely poorly, and if you try to overload them with an absurd variety of niche extensions, then nobody is going to find the right RISC-V MCU for their needs.
> There are maybe 5 x86 or ARM variants to care about at any given time

What? There are individual chips with nearly that many ARM variants, including incompatible ISAs (M0 vs R52) and compatible-but-very-different-performance-characteristics implementations of the same ISA (M4 vs M7, say). Even figuring out what portion of code can be shared across which cores (and for those that distinguish between ARM and Thumb mode, what mode that code can be called in), vs what code needs duplicate versions for different cores for correctness, vs what code needs duplicate versions for performance but not correctness (which changes as the code usage pattern evolves) can be a challenge on a single chip; I can't imagine a world where you can think about only five across an entire industry.