I'm using the old GNU toolchain branch from brucehoult and write the benchmarks so that they work with both rvv 1.0 and rvv 0.7.1.
To help with that, I've got a GNU assembler macro file that does 1-to-1 instruction translation when possible, and for the other cases I need to fall back to #if/#else/#endif.
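For anyone curious what the #if/#else/#endif fallback looks like in general, here's a minimal sketch. It is not the actual macro file: the `RVV_0_7_1` define, the register choices, and the `load_snippet` helper are all made up for illustration. It only picks between two spellings of the same 32-bit unit-stride load, relying on the fact that 0.7.1's `vsetvli` takes no `ta`/`ma` policy operands and that 0.7.1's `vle.v` loads SEW-wide elements.

```c
/* Hypothetical switch: define RVV_0_7_1 when targeting a 0.7.1 core. */
#ifdef RVV_0_7_1
/* rvv 0.7.1: no tail/mask policy operands; vle.v loads SEW-wide elements. */
#  define VSETVLI_E32M1 "vsetvli t0, a0, e32, m1"
#  define VLE32         "vle.v"
#else
/* rvv 1.0: policy operands spelled out; element width is part of the mnemonic. */
#  define VSETVLI_E32M1 "vsetvli t0, a0, e32, m1, ta, ma"
#  define VLE32         "vle32.v"
#endif

/* Returns the assembly text for one 32-bit vector load in the selected dialect.
   Registers (t0, a0, a1, v8) are arbitrary choices for this example. */
const char *load_snippet(void) {
    return VSETVLI_E32M1 "\n" VLE32 " v8, (a1)\n";
}
```

Compiling with `-DRVV_0_7_1` switches the returned text to the 0.7.1 spelling; the same idea works with the preprocessor run over a `.S` file instead of C strings.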
I'm working on rvv Unicode conversions, and plan to release a blog post/article with benchmarks and an explanation soon.
The problem is that I've used C intrinsics to write it, so I need to manually translate the assembly code to rvv 0.7.1 compatible code.
Also, the board I've got access to (it might be an eval board, idk) has a bug in vredmax, so I also needed to adjust the code to not use that. (I've yet to investigate that further)
I was thinking about that, but matmul is too hardware-specific (cache sizes and the like), and I'm not confident I can write an implementation that gets close to max performance.
Have you run into any toolchain difficulties due to the rvv being 0.7.1?