Be aware that these invocations will run your cores at full throttle, since you don't get fine-grained control over how the work is scheduled. But as long as your data sits in a vector and the per-element computations don't depend on each other, it's a game changer. I use it all the time to multithread things with literally a single line of code.
I looked into it (my video, probably too long: https://youtu.be/9oh66SF91LA?si=azDCSOAJKA9Gpzim), and the general result was that they make sense for non-small datasets and are a solid way to parallelize something without having to pull in OpenMP or similar.
They're only supported in MSVC and GCC (for the latter you need to link against Intel's TBB to make it work). Support in libc++ (Clang) is work in progress.
Clang does support the parallel STL already (it requires either TBB or OpenMP).
Our project, https://github.com/elalish/manifold, uses this to speed up its mesh processing algorithms a lot.