http://thelaziestprogrammer.com/sharrington/math-of-machine-...
This looks at the algorithm the same way the OP does (mathematically, visually, and programmatically), but applies it to a simpler linear regression model.