Have you considered our Highway library? Runtime dispatch need not be a PITA :) It's basically portable intrinsics, and a much more complete set (>300) than the ~50 in std.
I hadn't but it would make sense for doing my own personal programming challenges.
Given the ongoing disasters around the software supply chains I've been fighting the creeping NPM-ism that people are trying to introduce to C++, where you just FetchContent 20 different libraries to build your own app upon.
I do use gtest, fmt and a few others though, so something as broadly used as Highway would probably be fine by that standard as well. But I'd still like it better if there was a Good Enough solution that was part of C++ stdlib to reduce the number of external integrations that are deemed required for a modern C++ program.
Fair point. If it helps, our security team has called Highway critical infrastructure and helped to harden the repo.
The flip side of standardization is that it would be much harder and slower to add ops as the need arises, which we do regularly.
Yes, the EMU128 target is scalar only, with for loops. This is a fun way to see how well autovectorization works, with the same source code.
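To illustrate what "scalar only, with for loops" means in practice, here is a minimal sketch of emulating a 128-bit vector of four floats with a plain array. It is not Highway's EMU128 code. The names `Vec4f`, `LoadEmu`, `StoreEmu`, `MulAddEmu` and `MulAddLoopEmu` are hypothetical and chosen only for this example.

```cpp
#include <cstddef>

// Hypothetical stand-in for a 128-bit vector: four float lanes in an array.
struct Vec4f {
  float lane[4];
};

// Copies four consecutive floats from memory into the emulated vector.
inline Vec4f LoadEmu(const float* p) {
  Vec4f v;
  for (std::size_t i = 0; i < 4; ++i) v.lane[i] = p[i];
  return v;
}

// Writes the four lanes back to memory.
inline void StoreEmu(const Vec4f& v, float* p) {
  for (std::size_t i = 0; i < 4; ++i) p[i] = v.lane[i];
}

// Lane-wise a * b + c, written as an ordinary loop over the lanes.
// The fixed trip count is what gives the autovectorizer a chance.
inline Vec4f MulAddEmu(const Vec4f& a, const Vec4f& b, const Vec4f& c) {
  Vec4f r;
  for (std::size_t i = 0; i < 4; ++i) {
    r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
  }
  return r;
}

// out[i] = x[i] * y[i] + out[i], four lanes at a time, then a scalar tail.
void MulAddLoopEmu(const float* x, const float* y, float* out, std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    const Vec4f r = MulAddEmu(LoadEmu(x + i), LoadEmu(y + i), LoadEmu(out + i));
    StoreEmu(r, out + i);
  }
  for (; i < n; ++i) out[i] = x[i] * y[i] + out[i];  // remaining elements
}
```

Compiling something like this at `-O2` or `-O3` and inspecting the assembly shows whether the compiler turned the lane loops back into SIMD instructions.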
That works on any CPU. Curious which projects have such concerns, any link?
People reported challenges building V8 (whether upstream or the Node.js variant) on s390x with z13 support. I don't know if it was discussed on the porters mailing list because it's not public: https://groups.google.com/g/v8-s390-ports
Thanks for sharing. The first link does indeed seem to be non-public.
I can imagine there is some compile issue we could reasonably fix, with the help of someone who has Z13 access. Please encourage them to raise an issue. I will be back on May 26.
After that, it should at least be able to use the scalar fallback.
The issue with Z13 is that it lacks fp32 support. Would their usage be integer only?
I'll bring it up with some folks. It probably won't change much because the z13 transition has finished by now. It's still good to know because RISC-V is in the same boat regarding Highway support today: we need scalar fallback in Highway until we get RVA23 hardware deployed.
Any suggestions for improvement? We went through >5 iterations of the dispatching and I am fairly confident this is about as good as it gets in current C++.
I suppose "macro hell" is a matter of taste. Objectively, we have six dispatch-related macros in the example: https://gcc.godbolt.org/z/KM3ben7E
The ~two dozen lines of boilerplate are generally copied from an example.
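For context on what that boilerplate replaces, here is a minimal hand-rolled sketch of runtime dispatch: compile a function once per instruction set, then pick an implementation at first call. It does not use Highway and is not how Highway implements dispatch. The names `AddArrays`, `AddBaseline`, `AddAvx2`, `ChooseAdd` and the `SKETCH_HAVE_AVX2_VARIANT` macro are hypothetical, and the AVX2 path assumes GCC or Clang on x86-64.

```cpp
#include <cstddef>

// Assumption: only GCC/Clang on x86-64 get the extra AVX2 variant.
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define SKETCH_HAVE_AVX2_VARIANT 1
#else
#define SKETCH_HAVE_AVX2_VARIANT 0
#endif

using AddFunc = void (*)(const float*, const float*, float*, std::size_t);

// Baseline variant, compiled with the default target flags.
void AddBaseline(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if SKETCH_HAVE_AVX2_VARIANT
// Same source, but the compiler may use AVX2 instructions in this function.
__attribute__((target("avx2")))
void AddAvx2(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
#endif

// Checks the CPU and returns the best variant that is safe to run on it.
AddFunc ChooseAdd() {
#if SKETCH_HAVE_AVX2_VARIANT
  if (__builtin_cpu_supports("avx2")) return &AddAvx2;
#endif
  return &AddBaseline;
}

// Public entry point: the choice is made once, on the first call.
void AddArrays(const float* a, const float* b, float* out, std::size_t n) {
  static const AddFunc chosen = ChooseAdd();
  chosen(a, b, out, n);
}
```

Note that the loop body is duplicated by hand for every target here. Avoiding that duplication across many targets is the job the macros and the re-included source file take on.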
But why multi-file?