Hacker News
by phkahler 1773 days ago
My biggest complaint is that SIMD opens the door to doing proper vector math (dot product, cross product, vector addition, etc.) in a single instruction. I'd like to see languages like C++ and Rust adopt 2-, 3-, and 4-element vectors as first-class types and use these registers and instructions for them.

I don't want to use language features to define my own vector types and figure out how to minimize overhead in the implementation. I tried that years ago in C++ and ended up not using some of my overloaded operators because they were too slow. Some of that can be overcome with C++11 features (I think), or others. Point is, vector math is so common that I think these types deserve first-class support in languages.
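
For reference, this is roughly what the hand-rolled approach looks like today. It is a minimal sketch, not anyone's actual library: the `Vec3` type and the free functions `dot` and `cross` are hypothetical names, and whether the compiler maps them onto SIMD registers is entirely up to the optimizer.

```cpp
// Hypothetical 3-element vector type; the names Vec3, dot and cross are
// illustrative and not taken from any library.
struct Vec3 {
    float x, y, z;
};

// Small by-value constexpr operators give the optimizer the best chance to
// inline them and keep the components in registers.
constexpr Vec3 operator+(Vec3 a, Vec3 b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

constexpr Vec3 operator*(float s, Vec3 v) {
    return {s * v.x, s * v.y, s * v.z};
}

constexpr float dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

constexpr Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Because everything is constexpr, simple cases can be checked at compile time.
static_assert(dot(Vec3{1.0f, 2.0f, 3.0f}, Vec3{4.0f, 5.0f, 6.0f}) == 32.0f,
              "1*4 + 2*5 + 3*6 == 32");
```

Nothing here asks for SIMD instructions explicitly, which is the gap the post is pointing at: the type is a library convention, not something the language knows about.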

6 comments

I'm not convinced that the "4-vector" is a very useful C++ concept. Sure, it easily maps to 4-wide SIMD registers, but is that really what you want?

It's clear to anyone who has tried it that the NVidia CUDA / OpenCL / Intel ISPC approach is superior. Treating each SIMD lane as a thread is easier to understand than you'd expect.

NVidia CUDA and AMD ROCm/HIP are your C++ languages that compile into SIMD code. OpenCL isn't really C++ but kinda is associated with it. Intel is doing the OneAPI thing but I don't know much about it yet.

Python, Julia, and other high-level languages are also moving to the "SIMD lanes as threads" approach. It's just fundamentally easier to think about.
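
To make the "lanes as threads" idea concrete, here is a rough sketch in plain C++. It only emulates the programming model: `saxpy_lane`, `launch` and `saxpy` are hypothetical names, and the launcher is an ordinary loop standing in for whatever CUDA, ISPC or a vectorizing compiler would do with it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-lane "kernel": written as if it handled exactly one element, the way a
// CUDA or ISPC program is written. The name saxpy_lane is hypothetical.
inline void saxpy_lane(std::size_t lane, float a, const float* x, float* y) {
    y[lane] = a * x[lane] + y[lane];
}

// Stand-in for the runtime that maps lanes onto hardware. Here it is a plain
// loop that a compiler is free to auto-vectorize; on a GPU each iteration
// would be its own thread.
template <typename Kernel, typename... Args>
void launch(std::size_t lanes, Kernel kernel, Args... args) {
    for (std::size_t lane = 0; lane < lanes; ++lane) {
        kernel(lane, args...);
    }
}

// Convenience wrapper: y = a * x + y, one lane per element.
inline void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    launch(y.size(), saxpy_lane, a, x.data(), y.data());
}
```

The kernel body never mentions vector width, which is the appeal: the same scalar-looking code can be spread over 4, 8 or 32 lanes without being rewritten.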

> SIMD opens the door to doing proper vector math (dot product

A dot product turns vectors into a scalar value, so doing that for a single pair of vectors is a bad fit for SIMD. You can write SIMD code that efficiently computes multiple dot products in parallel, but that needs a completely different vector layout.
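
A sketch of what that different layout might look like, assuming a structure-of-arrays arrangement (the usual choice for this, though the comment doesn't name one). `Vec3Batch` and `dot_batch` are hypothetical names, and the loop relies on the compiler to vectorize it; no intrinsics are used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Structure-of-arrays layout: all x components are contiguous, then all y,
// then all z. The name Vec3Batch is hypothetical.
struct Vec3Batch {
    std::vector<float> x, y, z;
};

// Computes out[i] = dot(a[i], b[i]) for every i. Each iteration is
// independent and reads contiguous memory, so one SIMD register can hold the
// x (or y, or z) components of several vectors at once and no horizontal
// add across lanes is needed.
std::vector<float> dot_batch(const Vec3Batch& a, const Vec3Batch& b) {
    const std::size_t n = a.x.size();
    assert(a.y.size() == n && a.z.size() == n);
    assert(b.x.size() == n && b.y.size() == n && b.z.size() == n);

    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a.x[i] * b.x[i] + a.y[i] * b.y[i] + a.z[i] * b.z[i];
    }
    return out;
}
```

With an array-of-structs layout (`x y z x y z ...`) the same computation has to sum across the lanes of one register, which is the part that fits SIMD poorly.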

Well, Rust is getting support for portable vectors (I'm one of the ones working on that): https://github.com/rust-lang/portable-simd

How about geometric algebra libraries like Klein and g3? Klein has operator overloading for the basis elements of the algebra implemented using Intel's SSE, while g3 does the same but uses Rust's portable stdsimd crate.

Klein: https://www.jeremyong.com/klein/ g3: https://github.com/wrnrlr/g3

Making Julia viable for your use case is probably a quicker path towards this goal than trying to persuade C++ and Rust to cater to the math audience.

Alignment requirements for loading/storing wider registers make this not worth the effort. SIMD instruction sets also generally don't have anything like dot or cross products.

Shading languages do of course have first-class vector types, but all modern GPUs implement them with plain scalar code.
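
A small illustration of the alignment point, assuming 4-byte `float` and 16-byte SSE-style registers. The type names and the `is_aligned` helper are hypothetical. A tightly packed 3-float vector is 12 bytes, so consecutive array elements can't all sit on 16-byte boundaries; padding fixes that at the cost of a third more memory.

```cpp
#include <cstddef>
#include <cstdint>

// 12 bytes, 4-byte aligned (assuming 4-byte float): compact, but array
// elements drift off 16-byte boundaries.
struct Vec3Packed {
    float x, y, z;
};

// Padded to 16 bytes so every array element starts on a 16-byte boundary,
// which aligned SSE loads such as _mm_load_ps require.
struct alignas(16) Vec3Padded {
    float x, y, z;
};

static_assert(sizeof(Vec3Packed) == 12, "assumes 4-byte float");
static_assert(alignof(Vec3Padded) == 16, "alignas(16) sets the alignment");
static_assert(sizeof(Vec3Padded) == 16, "size is rounded up to the alignment");

// Hypothetical helper: true if p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Either choice costs something: unaligned loads for the packed type, or wasted memory and bandwidth for the padded one.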