Yeah, those algorithms are numerically tested, haven't changed in years, and are really fast. You can certainly get convenient matrix manipulation in C++, but your compile times will increase significantly.
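To make the C++ side concrete, here is a rough sketch of the kind of operator overloading that gets you `a * b + a` syntax. The `Matrix` type and everything in it is hypothetical, written only for illustration; it is not how any particular library does it. Real libraries typically put much heavier template machinery in headers, which is where the compile-time cost tends to come from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal dense matrix (row-major storage), for illustration only.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c, double fill = 0.0)
        : rows(r), cols(c), data(r * c, fill) {}

    // Element access by (row, column).
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Element-wise sum; both operands must have the same shape.
Matrix operator+(const Matrix& a, const Matrix& b) {
    assert(a.rows == b.rows && a.cols == b.cols);
    Matrix out(a.rows, a.cols);
    for (std::size_t k = 0; k < out.data.size(); ++k)
        out.data[k] = a.data[k] + b.data[k];
    return out;
}

// Naive triple-loop matrix product; no blocking or vectorization hints.
Matrix operator*(const Matrix& a, const Matrix& b) {
    assert(a.cols == b.rows);
    Matrix out(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < a.cols; ++k)
            for (std::size_t j = 0; j < b.cols; ++j)
                out(i, j) += a(i, k) * b(k, j);
    return out;
}
```

With this in place, `Matrix c = a * b + a;` reads much like the Fortran `c = matmul(a, b) + a`, but each operator here allocates a temporary, which is the overhead that template-heavy libraries work to avoid.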
Fortran is vectorized and parallelized at every level of the language, from instructions to shared- and distributed-memory parallelism (the latter corresponding to one-sided MPI communication). This has been the case for at least three decades. There is mature GPU support for Fortran. NVIDIA and PGI are currently implementing the standard parallel features of Fortran on GPUs.
not rhetorical, genuine question.