by bollu
2214 days ago
Maybe it's just me, but I found matrix calculus atrocious. The rules aren't even useful in most situations, since they're essentially "hardcoded" for the particular matrix expressions that crop up often.

I _much_ prefer to reduce a given matrix expression to Einstein summation convention, at which point all of the "regular" calculus rules just work. You can bash it out from this point on.

For example, consider `x^T x`. Matrix calculus tells us that its derivative with respect to x is `2x`. To get this using summation convention, we first write it in terms of coordinates:

y = xi xi [summation over i implicit]

dy/dxj
= d(xi^2)/dxj
= d(xi^2)/dxi * dxi/dxj [chain rule]
= 2xi delta(ij) [all xi independent, so dxi/dxj = Kronecker delta]
= 2xj [summing over i]

dy/dx = 2x
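
If you want to sanity-check the derivation numerically, here is a minimal sketch in plain Python. The function names (`kronecker`, `grad_index_notation`, `grad_finite_difference`) and the test vector are made up for illustration; it just evaluates `2 xi delta(ij)` with an explicit sum over i and compares it against a central finite difference of `y = xi xi`.

```python
def kronecker(i, j):
    """Kronecker delta: 1 when the indices match, 0 otherwise."""
    return 1.0 if i == j else 0.0


def y(x):
    """y = xi xi, with the sum over i written out explicitly."""
    return sum(xi * xi for xi in x)


def grad_index_notation(x):
    """dy/dxj = 2 xi delta(ij), summed over i, for each free index j."""
    n = len(x)
    return [
        sum(2.0 * x[i] * kronecker(i, j) for i in range(n))
        for j in range(n)
    ]


def grad_finite_difference(x, h=1e-6):
    """Central-difference estimate of dy/dxj, one coordinate at a time."""
    grad = []
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        grad.append((y(x_plus) - y(x_minus)) / (2.0 * h))
    return grad


# Arbitrary example vector (an assumption, not from the comment above).
x = [1.0, -2.0, 3.5]
analytic = grad_index_notation(x)
numeric = grad_finite_difference(x)
print(analytic)
print(numeric)
```

The two lists should agree up to floating-point noise, and both should match `2x`. The sum over i in `grad_index_notation` is deliberately left unsimplified so that it mirrors the `2xi delta(ij)` line of the derivation.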
|
|
It's when you take derivatives of vectors & matrices by other vectors & matrices that things get "interesting".