by camel-cdr
98 days ago
> For example, should we use vrgather (with what LMUL), or interesting workarounds such as widening+slide1, to implement a basic operation such as interleaving two vectors?

Use Zvzip. In the meantime:

zip: vwmaccu.vx(vwaddu.vv(a, b), -1, b), or segmented load/store when you are touching memory anyway

unzip: vnsrl

trn1/trn2: masked vslide1up/vslide1down with an even/odd mask

The only one of these that base RVV handles badly is register-to-register zip, which takes twice as many instructions as on other ISAs. Zvzip gives you dedicated instructions for all of the above.
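To see why the zip and unzip tricks above work, here is a minimal scalar sketch in Python. It only models the arithmetic, it is not RVV code: the helper names mirror the instruction mnemonics but are hypothetical Python functions, SEW = 8 is an assumed element width, and elements are assumed to be laid out little-endian within each widened element. The key identity is (a + b) + (2^SEW - 1) * b = a + 2^SEW * b, so each widened element holds a in its low half and b in its high half.

```python
SEW = 8                          # assumed element width in bits
MASK = (1 << SEW) - 1            # all-ones SEW-bit value
WIDE_MASK = (1 << (2 * SEW)) - 1 # all-ones 2*SEW-bit value


def vwaddu_vv(a, b):
    """Model of an unsigned widening add: 2*SEW-bit a[i] + b[i]."""
    return [(x + y) & WIDE_MASK for x, y in zip(a, b)]


def vwmaccu_vx(acc, scalar, b):
    """Model of an unsigned widening multiply-accumulate:
    acc[i] += scalar * b[i], with scalar read as an unsigned SEW-bit value."""
    s = scalar & MASK  # -1 becomes 2**SEW - 1
    return [(w + s * y) & WIDE_MASK for w, y in zip(acc, b)]


def vnsrl_wi(wide, shift):
    """Model of a narrowing logical shift right: keep the low SEW bits
    of each 2*SEW-bit element after shifting."""
    return [(w >> shift) & MASK for w in wide]


def zip_vectors(a, b):
    """Interleave a and b: a0, b0, a1, b1, ..."""
    wide = vwmaccu_vx(vwaddu_vv(a, b), -1, b)  # wide[i] = a[i] + (b[i] << SEW)
    out = []
    for w in wide:
        # Reinterpret each wide element as two SEW-bit elements (low half first).
        out.append(w & MASK)
        out.append(w >> SEW)
    return out


def unzip_vectors(z):
    """Split an interleaved vector into its even and odd elements."""
    # Reinterpret pairs of SEW-bit elements as one 2*SEW-bit element.
    wide = [z[i] | (z[i + 1] << SEW) for i in range(0, len(z), 2)]
    return vnsrl_wi(wide, 0), vnsrl_wi(wide, SEW)


print(zip_vectors([1, 2, 3], [10, 20, 30]))       # [1, 10, 2, 20, 3, 30]
print(unzip_vectors([1, 10, 2, 20, 3, 30]))       # ([1, 2, 3], [10, 20, 30])
```

The widened sum never overflows: the largest value is (2^SEW - 1) + 2^SEW * (2^SEW - 1) = 2^(2*SEW) - 1, which is exactly the largest 2*SEW-bit value.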
Great that you did a gap analysis [1]. I'm curious if one of the inputs for that was the list of Highway ops [2]?
[1]: https://gist.github.com/camel-cdr/99a41367d6529f390d25e36ca3...
[2]: https://github.com/google/highway/blob/master/g3doc/quick_re...