Your blog post makes an important mistake: it recommends explicitly inverting the matrix X^T X, which one never does in practice. In general, A^{-1} b is mathematical "slang" for "a solution to the linear system Ax = b", not an instruction to compute A^{-1} and multiply b by it.
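To make the distinction concrete, here is a rough NumPy sketch. The data, sizes, and variable names are made up for illustration and are not taken from the post; the point is only the contrast between forming the inverse and solving the system.

```python
import numpy as np

# Synthetic regression problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.01 * rng.normal(size=50)

# Literal reading of the formula: form the inverse, then multiply. Avoid this.
beta_inv = np.linalg.inv(X.T @ X) @ (X.T @ y)

# Treat the normal equations (X^T X) beta = X^T y as a linear system instead.
beta_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Or skip forming X^T X entirely and solve the least-squares problem directly.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

On a small, well-conditioned problem like this one all three agree to within rounding. The differences in cost and accuracy show up when X^T X is large or ill-conditioned, which is where the explicit inverse does worst.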
This is discussed in many places, but here are a few examples: